Starting a Prince of Persia port...

kieranhj
Posts: 452
Joined: Sat Sep 19, 2015 10:11 pm
Location: Farnham, Surrey, UK

Re: Starting a Prince of Persia port...

Post by kieranhj » Fri Sep 08, 2017 3:41 pm

Rich Talbot-Watkins wrote: My gut instinct on this is that the only bit which you can probably truly 'port' is the game/animation logic, basically all the high-level stuff. The graphics routines are going to need to get a total rewrite to conform more with the Beeb hardware.

My take on it would be to bite the bullet and go to MODE 2 (or 1). You could keep the tile size the same, and store the sprites in a more native format; so 10 tiles of 14 (or 28) pixels wide, which fits nicely on a byte boundary still, and allows for a 70 column wide screen. Perhaps I haven't grasped quite how low memory is! Obviously doing that would double the size of the sprite data (and it's already doubled the size of the screen), but I suspect it'd be possible to compress the sprites, even with a RLE type method. I would find a way to avoid having to do double-buffering so you can at least use a shadow screen, and keep main RAM for non-graphics related stuff.

Do you have a basic overview of the memory needs for PoP? In honesty, I'm struggling to imagine how it can be so demanding - unless it's just that there are a lot of frames of animation.

I think you might be right, Rich. The graphics routines will need a total rewrite but thankfully they are a very small part of the overall codebase. There is a massive amount of gameplay code which should stay completely intact. All of this is in Aux (SHADOW) RAM at the moment, with Main RAM being used for the screen buffers and rendering code.

On the sprite front, this is from my notes for what needs to be permanently resident in SWRAM:

BANK0 = BGTAB1.XXX + BGTAB2.XXX = 9185b + 4593b = 13778b                  <-- Level BG
BANK1 = CHTAB1 + CHTAB3 = 9165b + 5985b = 15150b                          <-- Player
BANK2 = CHTAB2 + CHTAB5 = 9189b + 6134b = 15323b                          <-- Player
BANK3 = CHTAB4.XXX + CHTAB6.X + CHTAB7 = 5281b + 9201b + 1155b = 15637b   <-- Guard + Princess + Boss

Total = 59888b = ~ 58.5KB
Free = 5648b = ~ 5.5KB

This is sprite data in Apple II pixel format. Pretty sure CHTAB6&7 are only required for cutscenes so these could be loaded on demand.
From my brief tests this data grows by at least 50% in MODE 1 form (2bpp, uncompressed of course). Not all of the sprites are required; for example, some of the larger ones are actually text message boxes!
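
As a quick sanity check on the figures above, the bank totals can be recomputed. This is only a sketch: the 16KB bank size is the usual sideways RAM bank size, and the MODE 1 estimate simply applies the "at least 50%" figure to every table, which is an assumption since not all sprites would be converted.

```python
# Recompute the sideways RAM budget from the table sizes quoted above.
BANK_SIZE = 16 * 1024  # one sideways RAM bank

banks = {
    "BANK0": [9185, 4593],        # BGTAB1 + BGTAB2
    "BANK1": [9165, 5985],        # CHTAB1 + CHTAB3
    "BANK2": [9189, 6134],        # CHTAB2 + CHTAB5
    "BANK3": [5281, 9201, 1155],  # CHTAB4 + CHTAB6 + CHTAB7
}

total = sum(sum(sizes) for sizes in banks.values())
free = len(banks) * BANK_SIZE - total

# Rough MODE 1 estimate, assuming every table grows by exactly 50%.
mode1_estimate = total * 3 // 2

for name, sizes in banks.items():
    print(f"{name}: {sum(sizes)} bytes used, {BANK_SIZE - sum(sizes)} free")
print(f"Total = {total} bytes, free = {free} bytes")
print(f"MODE 1 estimate = {mode1_estimate} bytes vs {len(banks) * BANK_SIZE} available")
```

On those assumptions the uncompressed 2bpp data would not fit in four banks, which is why loading on demand and dropping unused sprites matter.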

Current memory usage:

Core ORG =  &E00
Core lib size =  &39D
Core code size =  &FCF
Core data size =  &E
Core BSS size =  &CE3
Core high watermark =  &2E5D
Core RAM free =  &1A3

Main ORG =  &3000
Main code & data size =  &1237
Main high watermark =  &4237
Screen buffer address =  &65C0 (280x192 MODE 4)
Main RAM free =  &2389

Aux ORG =  &3000
Aux code & data size =  &4030
Aux level blueprint size =  &900
Aux high watermark =  &7A00
Aux RAM free =  &600

Yes, there really is 16KB of just gameplay code in Aux and I haven't finished porting it all yet. Basically the entire game is hand-coded for every player action with lots of (for the time) sophisticated and subtle control behaviours - reminds me a lot of my first N64 game actually. :)

I'm losing a couple of pages to alignment and there are at least 8 pages I can steal from the OS for BSS.
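
For anyone wanting to check the memory map, the watermarks follow directly from the ORG and size figures. A small sketch, assuming the MODE 4 screen buffer is placed so that it ends at &8000 (that assumption reproduces the quoted screen address):

```python
# Recompute the high watermarks and free space from the figures quoted above.
core_org = 0x0E00
core_sizes = [0x39D, 0xFCF, 0x00E, 0xCE3]  # lib, code, data, BSS
core_top = core_org + sum(core_sizes)

main_org = 0x3000
main_top = main_org + 0x1237

# 280x192 at 1 bit per pixel, assumed to end at &8000.
screen_bytes = (280 // 8) * 192
screen_addr = 0x8000 - screen_bytes

print(f"Core high watermark = &{core_top:X}, free = &{main_org - core_top:X}")
print(f"Main high watermark = &{main_top:X}, free = &{screen_addr - main_top:X}")
print(f"Screen buffer = &{screen_addr:X} ({screen_bytes} bytes)")
```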

I think I'm going to proceed by trying the approach of converting Apple II into native MODE 4 for now, mostly because I've already written the converter and it keeps all the memory requirements about the same. I can figure out exactly what the expected behaviour of the sprite plotting functions is before worrying about shifting colour format pixels around efficiently.

Rich Talbot-Watkins
Posts: 1049
Joined: Thu Jan 13, 2005 5:20 pm
Location: Palma, Mallorca

Re: Starting a Prince of Persia port...

Post by Rich Talbot-Watkins » Fri Sep 08, 2017 5:09 pm

Hmm yeah, the graphics data is pretty bulky!

From my best understanding, the scenery tiles are an awkward size on the Apple (28x63), and are rendered in a complicated way. In particular, what intrigues me is the way the floors are rendered:

[Image: diagram of how the floor pieces overlap]

That overlapping bit is a true horror! Given that the Apple sprite data is 1bpp, there doesn't seem to be any way you could infer a sprite mask from that. So does it have a separate sprite mask? If so, that's great news as it means that the data size is no different from a 2bpp or 4bpp native format sprite. It would need a mask when plotting B and A; either that or it's a constant triangle mask which is applied when plotting B (so C is not overwritten), and it's an optional mask applied when plotting A (depending on whether it's mostly wall, or just a floor). Any idea how the masking works?

I guess this diagram is just a rough overview though. I'd assume it's more clever than that, and breaks up the tiles even more so that it can avoid storing and rendering large runs of blank.

Here are some thoughts for clawing back some memory (if expanding out the graphics to 2 or 4bpp):

* If using MODE 2, store the graphics 2bpp, thus giving the same memory footprint as the MODE 4 graphics. Then use the 'Exile' type approach of rendering them in any 4 colours (or 3 + transparent) of your choosing - I don't think the graphical style would make this limiting, and you could still get all 8 colours on screen.

* Use the top bit of each pixel as a foreground/background indicator. I think the top bit could be inferred from which section of the scenery tile it is (section A always foreground, the rest background?), so you wouldn't have to store it either. Then you could have the character plotting routines look at the top screen bits as a mask. Just save the screen data prior to plotting the character (there will be shadow RAM spare for that), and erase the character by restoring the screen data - nice and quick, and no need for double buffering.

* Long shot: LZ77 compress the scenery graphics in an entire block, and expand them out to the screen buffer (blanked) between screens, copying out only the tiles used by that particular screen to a cache. This approach could also work for the NPC graphics which are not all needed simultaneously (I think). I don't think RLE would give big gains. Neither might this approach of course, if the LZ compressed data + required cache is bigger than the unexpanded graphics.

* Break up the scenery tiles into smaller units which allow for some reuse.

* If we have mirrored sprites for the characters, ditch them and use a mirroring table instead to reverse the bytes (and obviously a different routine to plot them backwards).

Not sure how feasible any of that is, but just want to do a brain dump in case anything there sparks any thoughts of your own :D
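
To illustrate the mirroring table idea from the last bullet, here is a minimal sketch. It assumes plain 1bpp data with 8 pixels per byte; for 2bpp or 4bpp screen formats the table would have to swap whole pixels rather than single bits. The names are made up for the example.

```python
# Build a 256-entry table that reverses the pixel order within a 1bpp byte.
MIRROR = [int(f"{b:08b}"[::-1], 2) for b in range(256)]

def mirror_row(row):
    """Return one sprite row flipped horizontally.

    Both the byte order and the bit order within each byte are reversed.
    """
    return bytes(MIRROR[b] for b in reversed(row))

row = bytes([0b11110000, 0b00000001])
print([f"{b:08b}" for b in mirror_row(row)])
```

The table costs 256 bytes, which is small next to a second copy of every character frame.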

Re: Starting a Prince of Persia port...

Post by kieranhj » Mon Sep 11, 2017 12:38 pm

Rich Talbot-Watkins wrote: Hmm yeah, the graphics data is pretty bulky!

From my best understanding, the scenery tiles are an awkward size on the Apple (28x63), and are rendered in a complicated way. In particular, what intrigues me is the way the floors are rendered:

That overlapping bit is a true horror! Given that the Apple sprite data is 1bpp, there doesn't seem to be any way you could infer a sprite mask from that. So does it have a separate sprite mask? If so, that's great news as it means that the data size is no different from a 2bpp or 4bpp native format sprite. It would need a mask when plotting B and A; either that or it's a constant triangle mask which is applied when plotting B (so C is not overwritten), and it's an optional mask applied when plotting A (depending on whether it's mostly wall, or just a floor). Any idea how the masking works?

I guess this diagram is just a rough overview though. I'd assume it's more clever than that, and breaks up the tiles even more so that it can avoid storing and rendering large runs of blank.

Everything is drawn with the painter's algorithm, so back to front and overdrawn where necessary. A screen is made up of 10x3 pieces drawn bottom to top and left to right. Each piece is drawn as follows (the m versions are for the moveable pieces):

\*-------------------------------
\*
\*  Redraw entire block
\*
\*-------------------------------
.RedBlockSure
{
 jsr drawc ;C-section of piece below & to left
 jsr drawmc

 jsr drawb ;B-section of piece to left
 jsr drawmb

 jsr drawd ;D-section
 jsr drawmd

 jsr drawa ;A-section
 jsr drawma

 jmp drawfrnt ;A-section frontpiece
;(Note: This is necessary in case we do a
;layersave before we get to f.g. plane)
}

Note that this doesn't actually draw the sprites there and then. They are placed into image lists (background, foreground and mid), which are then plotted later on at the render stage:

\*-------------------------------
\*
\*  D R A W A L L
\*
\*  Draw everything in image lists
\*
\*  This is the only routine that calls HIRES routines.
\*
\*-------------------------------
.DRAWALL
{
 jsr DOGEN ;Do general stuff like cls

 lda blackflag ;TEMP
 bne label_1 ;

 jsr SNGPEEL ;"Peel off" characters
;(using the peel list we
;set up 2 frames ago)

.label_1 jsr ZEROPEEL ;Zero just-used peel list

 jsr DRAWWIPE ;Draw wipes

 jsr DRAWBACK ;Draw background plane images

 jsr DRAWMID ;Draw middle plane images
;(& save underlayers to now-clear peel list)

 jsr DRAWFORE ;Draw foreground plane images

 jmp DRAWMSG ;Draw messages
}

Each sprite can be plotted at any screen position, with clipping, mirroring and a choice of operation: STA, OR, AND (mask) and a special one (shift & XOR for the "ghost" player character). As the current plotting is quite slow, you can watch it assemble the screens layer by layer in the SSDs from earlier in the thread.
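
The two-stage scheme in the listings above (queue images first, plot them later in plane order) can be sketched roughly like this. All of the names and sprite labels are hypothetical and only the ordering idea is taken from the PoP code:

```python
# Two-stage drawing: queue sprites into per-plane lists, then plot them
# back to front. All names here are hypothetical, not taken from PoP.
from collections import namedtuple

Image = namedtuple("Image", "sprite x y op")

class ImageLists:
    PLANES = ("back", "mid", "fore")  # plotted in this order

    def __init__(self):
        self.lists = {plane: [] for plane in self.PLANES}

    def add(self, plane, sprite, x, y, op="STA"):
        """Queue a sprite; nothing touches the screen yet."""
        self.lists[plane].append(Image(sprite, x, y, op))

    def draw_all(self, plot):
        """Plot every queued image, background plane first."""
        for plane in self.PLANES:
            for image in self.lists[plane]:
                plot(image)
            self.lists[plane].clear()

lists = ImageLists()
lists.add("fore", "pillar_front", 28, 63, op="OR")
lists.add("back", "floor_c", 0, 63)
lists.add("mid", "kid", 20, 60, op="AND")

order = []
lists.draw_all(lambda image: order.append(image.sprite))
print(order)
```

Because plotting is deferred, the game logic can queue pieces in whatever order suits it and the renderer still draws back to front.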

There is no separate sprite mask data. It does infer a mask when required from this table: (note these are Apple II bits so "back to front" with lsb being left-most pixel)

\*-------------------------------
\*
\* MASKTAB
\*
\* Index: byte value w/hibit clr (0-127)
\* Returns mask byte w/hibit set
\*
\*-------------------------------

.MASKTAB
 EQUB $FF,$FC,$F8,$F8,$F1,$F0,$F0,$F0
 EQUB $E3,$E0,$E0,$E0,$E1,$E0,$E0,$E0
 EQUB $C7,$C4,$C0,$C0,$C1,$C0,$C0,$C0
 EQUB $C3,$C0,$C0,$C0,$C1,$C0,$C0,$C0

 EQUB $8F,$8C,$88,$88,$81,$80,$80,$80
 EQUB $83,$80,$80,$80,$81,$80,$80,$80
 EQUB $87,$84,$80,$80,$81,$80,$80,$80
 EQUB $83,$80,$80,$80,$81,$80,$80,$80

 EQUB $9F,$9C,$98,$98,$91,$90,$90,$90
 EQUB $83,$80,$80,$80,$81,$80,$80,$80
 EQUB $87,$84,$80,$80,$81,$80,$80,$80
 EQUB $83,$80,$80,$80,$81,$80,$80,$80

 EQUB $8F,$8C,$88,$88,$81,$80,$80,$80
 EQUB $83,$80,$80,$80,$81,$80,$80,$80
 EQUB $87,$84,$80,$80,$81,$80,$80,$80
 EQUB $83,$80,$80,$80,$81,$80,$80,$80
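
Reading the values, the table seems to follow a simple rule: widen every set pixel by one pixel to the left and right (within the 7 pixel bits), invert the result, and set the hi bit. That rule is inferred from the data here rather than from the original source comments, but it does reproduce the listed entries:

```python
def mask_byte(value):
    """Mask for a 7-pixel Apple II byte (hi bit clear on input).

    Set pixels are widened by one pixel on each side, then inverted,
    so the mask is 0 where the sprite (plus a one-pixel border) is.
    The hi bit is always set on output.
    """
    widened = (value | (value << 1) | (value >> 1)) & 0x7F
    return 0x80 | (~widened & 0x7F)

masktab = [mask_byte(v) for v in range(128)]

# First row of the table above, for comparison.
print(" ".join(f"${m:02X}" for m in masktab[:8]))
```

So the inferred mask is slightly fatter than the sprite itself, which would give characters a one-pixel outline against the background.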

Rich Talbot-Watkins wrote: Here are some thoughts for clawing back some memory (if expanding out the graphics to 2 or 4bpp):

* If using MODE 2, store the graphics 2bpp, thus giving the same memory footprint as the MODE 4 graphics. Then use the 'Exile' type approach of rendering them in any 4 colours (or 3 + transparent) of your choosing - I don't think the graphical style would make this limiting, and you could still get all 8 colours on screen.

I like this idea a lot and will definitely look into it once I've got the B&W version rendering correctly. The input parameters for the sprite plot routines are well documented, so it is just up to me to write the necessary routines (there are quite a few variations and fiddly features to consider, including clipping). Again, the code is well structured, so it should be possible to support different screen widths by altering only the render function that converts object X coordinates into plot coordinates, without affecting the gameplay. NB. Game objects have a different coordinate system from the background tiles.

Rich Talbot-Watkins wrote: * Use the top bit of each pixel as a foreground/background indicator. I think the top bit could be inferred from which section of the scenery tile it is (section A always foreground, the rest background?), so you wouldn't have to store it either. Then you could have the character plotting routines look at the top screen bits as a mask. Just save the screen data prior to plotting the character (there will be shadow RAM spare for that), and erase the character by restoring the screen data - nice and quick, and no need for double buffering.

PoP actually already saves and restores screen data prior to plotting mid-ground objects. The LAYRSAVE and PEEL functions are designed to do this: they copy the data out into (double) buffers using the same format as the sprite tables (so the same routines can be used to put it back). I haven't implemented the Beeb equivalents yet but have earmarked the 4K of MOS RAM at &8000 for this purpose.
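
The save-under and restore idea can be sketched as below. This is not the LAYRSAVE/PEEL code: the function names are reused only as labels, and the linear screen layout is an assumption for illustration (the real Beeb screen is arranged in character cells).

```python
# Hypothetical sketch of save-under and restore on a linear 1bpp screen.
# The layout (ROW_BYTES bytes per row, rows stored consecutively) is an
# assumption for illustration only.
ROW_BYTES = 35  # 280 pixels at 8 pixels per byte

def layersave(screen, col, row, width, height):
    """Copy a width x height byte rectangle out of the screen."""
    data = bytearray()
    for y in range(row, row + height):
        start = y * ROW_BYTES + col
        data += screen[start:start + width]
    return (col, row, width, height, bytes(data))

def peel(screen, saved):
    """Put a saved rectangle back, erasing whatever was drawn over it."""
    col, row, width, height, data = saved
    for i in range(height):
        start = (row + i) * ROW_BYTES + col
        screen[start:start + width] = data[i * width:(i + 1) * width]

screen = bytearray(ROW_BYTES * 192)
saved = layersave(screen, 4, 10, 3, 2)
screen[10 * ROW_BYTES + 4] = 0xFF  # "plot" a character over the area
peel(screen, saved)
print(screen[10 * ROW_BYTES + 4])
```

Keeping the saved data in a sprite-like record (position, size, bytes) is what lets the same plot routine put it back.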

Rich Talbot-Watkins wrote: * Long shot: LZ77 compress the scenery graphics in an entire block, and expand them out to the screen buffer (blanked) between screens, copying out only the tiles used by that particular screen to a cache. This approach could also work for the NPC graphics which are not all needed simultaneously (I think). I don't think RLE would give big gains. Neither might this approach of course, if the LZ compressed data + required cache is bigger than the unexpanded graphics.

* Break up the scenery tiles into smaller units which allow for some reuse.

Compression and/or caching could be an option. The game already uses two different sprite banks (Dungeon & Palace) for background tiles & selects one per level (there are actually three in the code, so not sure if there is technically a third bank made up of a combination of the other two - I need to investigate further later on). It can also be noted that from any given screen there are only four possible next screens (L/R/U/D), so in theory we can (pre)calculate all possible next tiles/pieces that we haven't already rendered if caching. (Also those that are no longer reachable.)

There is only one NPC (Guard) type permitted per level and again this is selected & loaded on a per-level basis. The Princess & bad guy (Vizier) are only used in cutscenes (apart from the final boss fight). The nuclear option would be to halve the number of animation frames for the player but that would be a shame.

Rich Talbot-Watkins wrote: * If we have mirrored sprites for the characters, ditch them and use a mirroring table instead to reverse the bytes (and obviously a different routine to plot them backwards).

Not sure how feasible any of that is, but just want to do a brain dump in case anything there sparks any thoughts of your own :D

Again, PoP already mirrors the sprites in software, using a separate set of plot functions for this.

There are some great thoughts in here Rich, as ever, thank you! I am continuing to explore just using converted Beeb 1bpp data in MODE4 so I can hopefully get the plotting functionality correct without any additional complications. Once this is working I can explore optimisations and the right approach to colour (MODE1 vs MODE2) and enlist the help of our resident artist if needs be.

Re: Starting a Prince of Persia port...

Post by kieranhj » Wed Sep 20, 2017 1:02 pm

Apologies all for being a bit quiet of late. I have been plugging away at PoP but reached various impasses / decision-forks, so I have been exploring them (and still am) and have found it harder to post concrete updates.

Essentially I implemented ~most~ of the standard sprite plot routine as PoP requires, including per-pixel placement and masking but not clipping or mirroring, in MODE 4, using sprite data converted to 8 pixels per byte format ahead of time and lots of shift & carry tables. I also implemented the sprite save & restore code using the 4K of MOS RAM as a buffer. The end result, whilst looking good from a level background POV, was disheartening - just too damn slow.

FWIW I've attached an SSD image if you want to see it in action (if 1 fps is action). Please be patient - it takes quite a while to load all of the various data files and will require all 4 banks of SWRAM + shadow (Master only). Or check it out online: https://bitshifters.github.io/jsbeeb/?disc=https://bitshifters.github.io/content/wip/pop-beeb-slow-sprites-mode4.ssd&autoboot&model=Master. There's no keyboard input implemented so don't try and move!

pop-beeb-slow-sprites-mode4.png
Sprites are now masked

I was starting the implementation of mirrored sprites but the whole thing was getting very complicated. It relies on a mountain (about 4k) of tables and requires all sorts of careful masking of bits to get 8 pixels per byte data plotted at 1 pixel alignment but in 7 pixel increments. PoP itself uses lots of masking but now everything needs to be masked twice - once for the sprite data as PoP sees it, then again for how the Beeb screen sees it. I also couldn't see how the main sprite plotting loop was going to be optimised using the current approach - saving the odd cycle here & there wasn't going to help.
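
For readers unfamiliar with shift & carry tables, here is a generic sketch of the technique for plotting 8 pixels per byte data at any pixel offset. The table layout and names are illustrative only and are not the actual tables used in the port.

```python
# Hypothetical shift and carry tables for plotting 8-pixel bytes at any
# pixel offset. The layout is illustrative, not the author's actual tables.
SHIFT = [[b >> n for b in range(256)] for n in range(8)]
CARRY = [[(b << (8 - n)) & 0xFF for b in range(256)] for n in range(8)]

def plot_row(screen_row, x, sprite_row):
    """OR a row of sprite bytes into screen_row at pixel position x.

    The caller must leave room for one spill byte after the sprite.
    """
    col, n = divmod(x, 8)
    for i, b in enumerate(sprite_row):
        screen_row[col + i] |= SHIFT[n][b]          # part in this byte
        if n:
            screen_row[col + i + 1] |= CARRY[n][b]  # spill into next byte

table_bytes = sum(len(t) for t in SHIFT) + sum(len(t) for t in CARRY)
print(table_bytes)

row = bytearray(4)
plot_row(row, 11, bytes([0b11111111]))
print([f"{b:08b}" for b in row])
```

Eight shifts of 256 entries for both tables comes to 4096 bytes, which is in the same ballpark as the "about 4k" mentioned above, though that may be a coincidence.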

I stopped for a pause and a rethink and have started to explore Rich's suggestion of using 2bpp sprite data but at half the horizontal resolution, so MODE 5 data effectively, but then expanded out at runtime to plot in MODE 2. Although I liked the hi-res MODE1/4 look there are a number of advantages, not least the possibility of "free" masking using appropriate palette mapping. Given the limitations of the original game, having 3 colours for foreground sprites and 4 colours for backgrounds should be plenty.

I've been trawling through various sprite plot implementations and particularly the Exile code disassembly. I'm essentially now in the process of copying that technique (with ZP optimisation to come later.) I haven't finished the new sprite routine yet but should be soon.
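
As a rough illustration of the general idea (and not Exile's actual code), expanding 2bpp MODE 5 bytes into MODE 2 bytes through a four-entry palette might look like this. It relies on the standard Beeb pixel layouts: in MODE 5, pixel i (left to right) uses bits 7-i and 3-i; in MODE 2, the left pixel sits in the odd bits and the right pixel in the even bits. The function names and the palette are made up for the example.

```python
def mode5_pixels(byte):
    """Split a MODE 5 byte into its four 2-bit pixels, left to right."""
    return [(((byte >> (7 - i)) & 1) << 1) | ((byte >> (3 - i)) & 1)
            for i in range(4)]

def spread(colour):
    """Place a 4-bit colour in the right-hand pixel bits (6, 4, 2, 0)."""
    return (((colour & 8) << 3) | ((colour & 4) << 2)
            | ((colour & 2) << 1) | (colour & 1))

def mode2_byte(left, right):
    """Pack two 4-bit colours into one MODE 2 byte."""
    return (spread(left) << 1) | spread(right)

def expand(byte, palette):
    """Expand one MODE 5 byte into two MODE 2 bytes via a 4-entry palette."""
    p = [palette[pixel] for pixel in mode5_pixels(byte)]
    return mode2_byte(p[0], p[1]), mode2_byte(p[2], p[3])

# Hypothetical palette: logical colours 0-3 mapped to black, red, yellow, white.
palette = [0, 1, 3, 7]
print([f"&{b:02X}" for b in expand(0x81, palette)])
```

On real hardware the per-pixel work would presumably be replaced by lookup tables, and swapping the palette per sprite is what allows any 4 of the MODE 2 colours to be used.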

Another important decision has been to try rounding up everything to be in multiples of 8 pixels rather than 7. This just makes everything so much easier to contend with on the Beeb and means that when halving the horizontal resolution I'm not faced with the smallest sprite size being 3 1/2 pixels wide. This implies that the new resolution is 160x192 in MODE 2 which stretches everything horizontally, so will look a bit "fat". Currently I'm just point sampling the original sprites in my simple Apple II image converter but the results don't look too bad in my simple BASIC image checker:

pop-beeb-kid-sprite-mode5.png
Kid sprite sheet in MODE 5
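
The point sampling and the rounding of 7 up to 8 described above can be sketched as follows. This is a guess at the general shape of such a converter, not the actual one: each Apple II byte (7 pixels, lsb left-most, hi bit ignored) becomes 4 half-resolution pixels.

```python
def apple_row_to_pixels(row):
    """Unpack Apple II hi-res bytes: 7 pixels per byte, lsb left-most."""
    return [(b >> bit) & 1 for b in row for bit in range(7)]

def point_sample(pixels, new_width):
    """Resize a row by picking the nearest source pixel (no filtering)."""
    old_width = len(pixels)
    return [pixels[x * old_width // new_width] for x in range(new_width)]

# One Apple II byte (7 pixels) becomes 4 half-resolution pixels, because
# 7 is rounded up to 8 before halving.
row = bytes([0b0001111])  # pixels 1,1,1,1,0,0,0 left to right
pixels = apple_row_to_pixels(row)
print(pixels, point_sample(pixels, 4 * len(row)))
```

Since 4 output pixels cover what would otherwise be 3 1/2, the image ends up stretched by 8/7 horizontally, which is the "fat" look mentioned above.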

I figure that if I manage to get this project approaching something like completed from a gameplay POV then I will take up DethMunk's kind offer of artist support for getting all of the sprites redrawn or at least tidied up by hand.

That's it for now. I will post again when I have the background rendering back up & running in MODE 2 and then onwards to a better sprite plot that is good enough for actual animation at runtime. As always your comments & feedback are appreciated.
Attachments
pop-beeb-slow-sprites-mode4.zip
Very slow sprite plotting in MODE 4

FourthStone
Posts: 369
Joined: Thu Nov 17, 2016 2:29 am
Location: Melbourne, Australia

Re: Starting a Prince of Persia port...

Post by FourthStone » Wed Sep 20, 2017 8:35 pm

Where's the like button on this thread??

Really enjoying this detailed exploration =D>

