| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
Profiling tool?
Has anyone come across or written a tool that lets you execute a chunk of 6502 that then reports back a disassembly where each instruction is annotated with the total number of cycles it consumed?
Something like unp64 but with added instrumentation, basically.
Closest a quick google found for me was a tool for generating such reports from logs produced by a hardware bus monitor.. |
|
| |
Krill
Registered: Apr 2002 Posts: 2980 |
Quoting ChristopherJam: "total cycles spent by that instruction, cumulative percentage of cycles expended, and an ASCII graph of the latter"
I wonder what log2(total cycles spent by that instruction) and the corresponding ASCII graph (with or without scaling to 64x "#" max.) would look like. I imagine "number of bits to represent number of cycles" to be quite an intuitive curve. :)
Edit: Hmm, that would basically be a scaled and left-justified version of the "total cycles spent by that instruction" (base 10) column. Maybe it should be linear‽ |
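For anyone who wants to compare the two graphs, here is a minimal Python sketch. It is not ChristopherJam's script; the function name `report`, the `{address: cycles}` input dictionary and the column layout are all assumptions made for illustration. It prints the per-instruction total, the cumulative percentage, a linear bar and a "number of bits" bar, both scaled so the most expensive instruction gets `width` characters.

```python
def report(cycles_by_addr, width=64):
    """Build report lines from a hypothetical {address: total_cycles} dict."""
    total = sum(cycles_by_addr.values())
    peak = max(cycles_by_addr.values())
    peak_bits = max(1, peak.bit_length())
    lines = []
    running = 0
    for addr in sorted(cycles_by_addr):
        cycles = cycles_by_addr[addr]
        running += cycles
        cum_pct = 100.0 * running / total
        # Linear bar: proportional to the cycle count itself.
        linear = "#" * round(width * cycles / peak)
        # Log bar: proportional to the number of bits needed for the count.
        log_bar = "#" * round(width * cycles.bit_length() / peak_bits)
        lines.append(
            f"{addr:04x} {cycles:8d} {cum_pct:6.2f}% "
            f"|{linear:<{width}}| |{log_bar:<{width}}|"
        )
    return lines


if __name__ == "__main__":
    sample = {0x0A00: 10, 0x0A02: 1000, 0x0A05: 90}  # made-up numbers
    print("\n".join(report(sample, width=32)))
```

With counts spanning several orders of magnitude, the log bar keeps cheap instructions visible while the linear bar makes the expensive ones stand out.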
| |
Slajerek
Registered: May 2015 Posts: 63 |
Interesting result of the script :)
Yes, the lack of a cycle stamp in each log line is a clear oversight on my part, apologies :) There's a cycle count in each jsr/rts marker, so Champ does not need a cycle count per line. Anyway, I definitely have to add better cycle counter handling and logging. |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
Krill: Yes, I graphed cumulative to make it easy to see at a glance how costly sections of code are, but the counts do indeed make it easy to see individual expensive instructions.
five digits bad, three or four digits less so :)
Slajerek: Thanks!
Oh, and no need to apologize; much as I'm looking forward to future enhancements this is already extraordinarily useful. |
| |
Krill
Registered: Apr 2002 Posts: 2980 |
Quoting ChristopherJam: "Krill: Yes, I graphed cumulative to make it easy to see at a glance how costly sections of code are, but the counts do indeed make it easy to see individual expensive instructions."
So you're looking for big slopes (high increase of #) when head-parsing the graph line by line from top to bottom, adjusting for increasing "ratio skew"?
I mean the cumulative graph always starts at 0 and ends at 100% in a strictly ascending manner, no matter the program or per-line cycle distribution. |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
Yes, I look at the slope as a proportion of the final width, rather than current width. Perhaps I should add a right hand border, or a second/lighter fill character between the current total and 100%? (eg ####__)
But yes, even as is you can spot things like the graph going near horizontal at 0a32 and 0a54, and you can also see it going from 1.5% to 89% in the space of 40 or so instructions. |
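A minimal sketch of the two-character fill idea, in Python. The function name `cumulative_bar` and its parameters are made up for illustration and are not taken from the actual script. Every row is padded to the same width, so the slope can be read against the fixed right-hand edge.

```python
def cumulative_bar(running, total, width=40, fill="#", rest="_"):
    """Render the cumulative share as a fixed-width bar, e.g. '####__'."""
    filled = round(width * running / total)
    return fill * filled + rest * (width - filled)


if __name__ == "__main__":
    total = 1100  # made-up cycle counts
    running = 0
    for addr, cycles in [(0x0A00, 10), (0x0A02, 1000), (0x0A05, 90)]:
        running += cycles
        print(f"{addr:04x} {cumulative_bar(running, total, width=20)}")
```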
| |
oziphantom
Registered: Oct 2014 Posts: 490 |
Does your log handle badlines, sprite DMA, IRQs, etc.? Do they just get tacked on to the cost of the instruction? Are you able to filter out what is and isn't IRQ? |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
This is purely CPU cycles at the moment; I don't have information about VIC bus accesses.
I suspect the only indication I would have that an IRQ has happened would be from parsing the preceding instruction and noting that the next one executed was in an unexpected location; not sure, as I've only tested this with screen blanked and interrupts disabled.
( Slajerek? :D ) |
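The "unexpected location" heuristic could be sketched like this in Python. The trace format is an assumption: a list of hypothetical `(pc, opcode, size)` tuples, not the real log layout. Only instructions that cannot change the program counter themselves are checked, so an interrupt taken right after a branch, JMP, JSR, RTS, RTI or BRK would be missed by this version.

```python
# 6502 opcodes that redirect control flow on their own:
# BRK, JSR, RTI, JMP abs, RTS, JMP (ind), and the eight relative branches.
FLOW_OPCODES = {
    0x00, 0x20, 0x40, 0x4C, 0x60, 0x6C,
    0x10, 0x30, 0x50, 0x70, 0x90, 0xB0, 0xD0, 0xF0,
}


def unexpected_jumps(trace):
    """Return indices of trace entries reached non-sequentially from a
    non-flow instruction (a possible interrupt entry).

    trace: list of (pc, opcode, size) tuples (hypothetical format).
    """
    hits = []
    for i in range(1, len(trace)):
        prev_pc, prev_op, prev_size = trace[i - 1]
        pc = trace[i][0]
        expected = (prev_pc + prev_size) & 0xFFFF
        if prev_op not in FLOW_OPCODES and pc != expected:
            hits.append(i)
    return hits
```

Cycles logged between a flagged entry and the matching RTI could then be reported separately rather than charged to the interrupted code.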
| |
Krill
Registered: Apr 2002 Posts: 2980 |
Enough to profile "pure" algorithms, converting input to output without side-effects. Which is all you need for now. :) |
| |
Martin Piper
Registered: Nov 2007 Posts: 722 |
Quote: VICE PDB Monitor 1.0 (get the extra files from GitHub as per the comments) has a limited profiler. Martin did it, so I'm not 100% sure how it works, but I think it uses the memmap operations to count how often an address is executed. So it doesn't give you clocks, but it does give hotspots. It's open source, so if you wish to extend it to be better, feel free ;)
Currently VICE PDB Monitor uses Vice's mmzap and mmshow between executions or breakpoints to gather execution information and generate a heat map. It's not entirely exact since Vice doesn't trap multiple executions from the same address.
I'm pondering adding some profiling syntax and command line arguments to BDD6502 to allow cycle-exact profiling and source-level debugging of the same.
One thing I would like to support is self-modifying code where the opcode is changed. This would be useful for very optimised self-modifying code. It does, however, pose challenges for regular profile views, because cycle counts can change when the opcode is changed. |
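One possible way to handle this, sketched in Python, is to key the profile on (address, opcode) pairs rather than on the address alone, so each variant of a modified instruction keeps its own total. This is only an illustration and not how BDD6502 works; the function names and the `(pc, opcode, cycles)` trace tuples are assumptions.

```python
from collections import defaultdict


def accumulate(trace):
    """Sum cycles per (address, opcode) pair.

    trace: iterable of (pc, opcode, cycles) tuples (hypothetical format).
    """
    totals = defaultdict(int)
    for pc, opcode, cycles in trace:
        totals[(pc, opcode)] += cycles
    return dict(totals)


def variants_by_address(totals):
    """Regroup the totals as {address: {opcode: cycles}} for display."""
    by_addr = defaultdict(dict)
    for (pc, opcode), cycles in totals.items():
        by_addr[pc][opcode] = cycles
    return dict(by_addr)


if __name__ == "__main__":
    # The same address executed as LDA zp ($A5, 3 cycles) twice and
    # as INC zp ($E6, 5 cycles) once.
    trace = [(0x0A10, 0xA5, 3), (0x0A10, 0xE6, 5), (0x0A10, 0xA5, 3)]
    for pc, ops in variants_by_address(accumulate(trace)).items():
        for opcode, cycles in sorted(ops.items()):
            print(f"{pc:04x} opcode ${opcode:02X}: {cycles} cycles")
```

A regular profile view could still show one row per address by summing the variants, with the per-opcode breakdown available on demand.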
| |
Slajerek
Registered: May 2015 Posts: 63 |
@oziphantom noted :) None of these are logged yet, and the Champ profile reports do not take them into account. Will add soon, thanks for pointing this out. |