SSA Ethnicity

Each SSA value in our intermediate representation (IR), and therefore each IR instruction, is tagged with an ethnicity that indicates the origin or role of that value in the original machine code. Transformation passes may be configured to operate selectively on one or more ethnicities while leaving others unchanged.


Available ethnicities

There are four ethnicity categories:

1. Normal

Default category for instructions and SSA values that do not match any other ethnicity. This is the largest category.

2. Memory Operand

SSA values that participate in the computation of memory operands (address calculations) are assigned this ethnicity.

Example: for the instruction ADD [RAX + 0x10], 0x20, the corresponding lifted IR can be visualized as the following DAG of instructions and data dependencies:


  0x10   RAX
     \   /
      ADD
     /   \
    |    LOAD   0x20
    |       \   /
    |        ADD
     \      /
      STORE

The subgraph that computes the memory address:

  0x10   RAX
     \   /
      ADD

is assigned the Memory Operand ethnicity.

3. FP-Based Memory Operand

Same as Memory Operand, but used when the memory operand uses a frame pointer register (e.g., RBP on x86-64 or X29 on AArch64). This ethnicity takes precedence over Memory Operand.

4. SP-Based Memory Operand

Same as FP-Based Memory Operand, but used when the memory operand uses a stack pointer register (e.g., RSP on x86-64 or SP on AArch64). This ethnicity takes precedence over both FP-Based Memory Operand and Memory Operand.

Note: Precedence (highest → lowest): SP-Based Memory Operand > FP-Based Memory Operand > Memory Operand > Normal.
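
The classification rules above can be sketched in Python. This is an illustrative model only, not the product's API: the `Ethnicity` enum, the `Node` class, the two register sets, and the helpers `classify`, `tag_address_subgraph`, and `build_example` are all hypothetical names introduced here. The sketch assumes the ethnicity of an address computation is decided by the registers that appear in it, and it rebuilds the DAG for ADD [RAX + 0x10], 0x20 from the example.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Ethnicity(IntEnum):
    # Higher value = higher precedence, mirroring the precedence note.
    NORMAL = 0
    MEMORY_OPERAND = 1
    FP_BASED_MEMORY_OPERAND = 2
    SP_BASED_MEMORY_OPERAND = 3


# Assumed register sets, using the x86-64 and AArch64 names from the text.
STACK_POINTERS = {"RSP", "SP"}
FRAME_POINTERS = {"RBP", "X29"}


@dataclass(eq=False)
class Node:
    op: str                                   # "ADD", "LOAD", "STORE", "REG", "CONST"
    operands: list = field(default_factory=list)
    name: str = ""                            # register name or constant text for leaves
    ethnicity: Ethnicity = Ethnicity.NORMAL   # default category


def subgraph(root):
    """Return every node reachable from root through operand edges."""
    seen = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen[id(node)] = node
            stack.extend(node.operands)
    return list(seen.values())


def classify(address_root):
    """Pick the ethnicity for an address computation, highest precedence first."""
    regs = {n.name for n in subgraph(address_root) if n.op == "REG"}
    if regs & STACK_POINTERS:
        return Ethnicity.SP_BASED_MEMORY_OPERAND
    if regs & FRAME_POINTERS:
        return Ethnicity.FP_BASED_MEMORY_OPERAND
    return Ethnicity.MEMORY_OPERAND


def tag_address_subgraph(address_root):
    """Tag every node of the address subgraph; never lower an existing tag."""
    eth = classify(address_root)
    for node in subgraph(address_root):
        node.ethnicity = max(node.ethnicity, eth)
    return eth


def build_example(base="RAX"):
    """Build the DAG for ADD [base + 0x10], 0x20 and tag its address subgraph."""
    addr = Node("ADD", [Node("CONST", name="0x10"), Node("REG", name=base)])
    load = Node("LOAD", [addr])
    total = Node("ADD", [load, Node("CONST", name="0x20")])
    store = Node("STORE", [addr, total])
    tag_address_subgraph(addr)
    return addr, load, total, store


addr, load, total, store = build_example()
print(addr.ethnicity.name)   # MEMORY_OPERAND
print(store.ethnicity.name)  # NORMAL
```

Modelling the categories as an ordered enum keeps the precedence rule in one place: when a node could receive more than one tag, taking the maximum selects the highest-precedence ethnicity.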


Intended uses and rationale

Ethnicities allow selective transformation and obfuscation of IR.

  • When code size or runtime speed is constrained, it is often desirable to avoid transforming IR that corresponds to compiler-generated addressing computations (stack offsets, frame-index calculations, etc.), since those computations typically were not present in the original source code and obfuscating them yields little benefit.

  • Stack and frame addressing computations are commonly inserted by the compiler when spilling temporaries or when using local variables. Because these computations are compilation artifacts rather than original program semantics, they are good candidates to exclude from aggressive obfuscation.

  • Using ethnicities, passes can be configured to:

    • Apply full obfuscation only to Normal and non-stack memory operand IR, while leaving SP-Based and FP-Based Memory Operand IR untouched, or
    • Apply lighter-weight transformations to memory operand IR than to Normal IR.
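
The first option in the list above can be sketched as a pass that filters on ethnicity. Everything here is hypothetical and for illustration only: `PassConfig`, `Instruction`, `run_pass`, and the `mark` stand-in transformation are not part of any documented API, and the instruction strings are made up.

```python
from dataclasses import dataclass
from enum import Enum


class Ethnicity(Enum):
    NORMAL = "normal"
    MEMORY_OPERAND = "memory-operand"
    FP_BASED_MEMORY_OPERAND = "fp-based-memory-operand"
    SP_BASED_MEMORY_OPERAND = "sp-based-memory-operand"


@dataclass
class Instruction:
    text: str
    ethnicity: Ethnicity


@dataclass(frozen=True)
class PassConfig:
    name: str
    ethnicities: frozenset  # the pass only touches instructions tagged with one of these


def run_pass(config, instructions, transform):
    """Apply transform to instructions with a selected ethnicity; keep the rest as-is."""
    return [
        transform(ins) if ins.ethnicity in config.ethnicities else ins
        for ins in instructions
    ]


def mark(ins):
    # Stand-in for a real transformation: it only annotates the instruction text.
    return Instruction(f"obf({ins.text})", ins.ethnicity)


# Full obfuscation for Normal and non-stack memory operand IR only.
FULL = PassConfig(
    "full-obfuscation",
    frozenset({Ethnicity.NORMAL, Ethnicity.MEMORY_OPERAND}),
)

program = [
    Instruction("ADD t0, 0x20", Ethnicity.NORMAL),
    Instruction("ADD RAX, 0x10", Ethnicity.MEMORY_OPERAND),
    Instruction("ADD RBP, -0x8", Ethnicity.FP_BASED_MEMORY_OPERAND),
    Instruction("ADD RSP, 0x18", Ethnicity.SP_BASED_MEMORY_OPERAND),
]

result = run_pass(FULL, program, mark)
for ins in result:
    print(ins.text)
```

The second option (lighter-weight transformations for memory operand IR) would follow the same shape: run a second `PassConfig` with a cheaper transformation over the memory operand ethnicities.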

Practical guidance

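  • When code size is constrained, the go-to configuration is to exclude SP-Based Memory Operand from every pass and, for passes that obscure immediate values, to exclude all memory operand ethnicities.
  • Many compilers omit the frame pointer and address the stack directly relative to the stack pointer. In such code the frame pointer register (e.g., RBP) is free for general-purpose use, so a memory operand based on it may be an ordinary data access rather than a stack access. As a result, excluding the FP-Based Memory Operand ethnicity may prevent obfuscation from being applied to important semantics.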
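
As a rough sketch of the size-constrained guidance above, the selection rule can be written as a small function. The string identifiers and the function name `size_constrained_ethnicities` are assumptions made for this example, not real configuration keys.

```python
# Hypothetical identifiers for the four ethnicities.
ALL = frozenset({
    "normal",
    "memory-operand",
    "fp-based-memory-operand",
    "sp-based-memory-operand",
})
MEMORY = ALL - {"normal"}


def size_constrained_ethnicities(obscures_immediates: bool) -> frozenset:
    """Return the ethnicities a pass should operate on when code size is constrained."""
    if obscures_immediates:
        # Immediate-obscuring passes skip every memory operand ethnicity.
        return ALL - MEMORY
    # All other passes skip only SP-based memory operands; FP-based stays
    # enabled because the frame pointer register may hold ordinary data.
    return ALL - {"sp-based-memory-operand"}
```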