wswan:guide:optimization

Differences

This shows you the differences between two versions of the page.

wswan:guide:optimization [2025/12/31 12:42] – created asie
wswan:guide:optimization [2025/12/31 12:50] (current) – [Optimizing for memory usage] asie
Line 1: Line 1:
-====== Writing optimized code ======
+====== Optimizing programs ======
  
 This page serves as a loose list of advice for getting the most out of the WonderSwan.
  
-===== Optimizing for speed =====
+===== Optimizing C code =====
+
+==== Optimizing for code speed ====
  
 To optimize for speed, compile your code with ''%%-O2%%''.
  
-===== Optimizing for size =====
+==== Optimizing for code size ====
  
 To optimize for size, compile your code with ''%%-Os%%''.
  
-===== Optimizing for memory usage =====
+==== Optimizing for memory usage ====
  
   * For data stored in RAM, use the smallest type possible.
     * Exception: For argument passing, there is little reason to prefer ''%%char%%'' over ''%%int%%'' - the stack is always aligned to 2 bytes.
-  * By default, GCC allows function call arguments to accumulate on the stack, then pops them all at once. To reduce stack usage at the cost of a larger and slightly slower program, compile your code with ''%%fno-defer-pop%%''.
+  * By default, GCC allows function call arguments to accumulate on the stack, then pops them all at once. To reduce peak stack usage at the cost of a larger and slightly slower program, compile your code with ''%%-fno-defer-pop%%''.
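
The advice on RAM types and argument passing can be illustrated with a minimal sketch. The ''%%entity_wide%%'' and ''%%entity_compact%%'' structures and the ''%%entity_is_visible%%'' helper are hypothetical names made up for this example; they are not part of any WonderSwan library. The sizes in the comments assume a 2-byte ''%%int%%'', which is what one would expect from a 16-bit target.

```c
#include <stdint.h>

/* Hypothetical record kept in RAM, using int for every field.
 * Assuming a 2-byte int, this takes 8 bytes per entity. */
struct entity_wide {
    int x;      /* tile column, fits in 0..255 */
    int y;      /* tile row, fits in 0..255 */
    int frame;  /* animation frame, fits in 0..255 */
    int flags;  /* bit 0: visible */
};

/* The same record with the smallest types that hold the values:
 * 4 bytes per entity. */
struct entity_compact {
    uint8_t x;
    uint8_t y;
    uint8_t frame;
    uint8_t flags;
};

/* The argument is declared as int rather than uint8_t: an argument
 * passed on the stack occupies a 2-byte slot either way, so the
 * narrower type would save nothing here. */
static int entity_is_visible(int flags)
{
    return flags & 1;
}
```

With these definitions, an array of 64 ''%%entity_compact%%'' records needs 256 bytes, while the same array of ''%%entity_wide%%'' records needs at least twice that. A caller passes the stored byte directly, as in ''%%entity_is_visible(e.flags)%%''.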
  
 ===== Optimizing assembly code =====
 +
 +==== Optimizing for speed ====
  
 While the V30MZ is an 80186-compatible CPU, its instruction timings differ wildly from common expectations and are more reflective of its 1990s-era design:
Line 29: Line 33:
  
 You can study the instruction timings in detail on [[https://ws.nesdev.org/wiki/NEC_V30MZ_instruction_set|the WSdev wiki]].
 +
 +There are also some additional tricks you can take advantage of:
 + 
 +  * Avoid far calls between functions - branches are expensive, and far branches are significantly more expensive. If you're calling a far function from another far function in the same section, use the ''%%IA16_CALL_LOCAL%%'' macro over a far call to save a few cycles.
 +  * Try word-aligning loop labels by prepending them with ''%%.align 2, 0x90%%'' - this generates a NOP opcode if necessary. This may help a little.