How to put assembly instructions inside C code

Only C or only assembly

C code

int f(int x)
{
    return 23 * x + 9;
}

Corresponding assembly code (Intel processor, AT&T syntax).

      .text
      .globl f
      .p2align 4,,15
f:    imull $23,%edi,%eax
      addl  $9,%eax
      ret

AT&T syntax (for a 64-bit Intel/AMD processor)

Register names.

  • 8-bit %al, %ah, %bl, %bh, …, %r8b, …, %r15b

  • 16-bit %ax, %bx, …, %r8w, …, %r15w

  • 32-bit %eax, %ebx, …, %r8d, …, %r15d

  • 64-bit %rax, %rbx, …, %r8, …, %r15

Constants (immediate operands) are preceded by a dollar sign ($), as in $23.

The destination operand comes last: addl $9,%eax adds 9 to %eax.

Instruction names end with a suffix that gives the operand size.

  • 8 bits (char) b --- addb

  • 16 bits (short) w --- addw

  • 32 bits (int) l --- addl

  • 64 bits (long) q --- addq

How to put assembly instructions inside C code

C code with an embedded assembly instruction.

unsigned long first_one_bit(unsigned long word)
{
    register unsigned long result;
    asm("bsfq %[data],%[result]"
        : [result] "=r" (result)
        : [data] "r" (word)
    );
    return result;
}

A more complex example.

unsigned long read_time_stamp_counter(void)
{ // read tsc register (MUST be compiled with -m64)
    unsigned long tmp;
    asm volatile
        (
            "rdtsc ; "
            "shlq $32,%%rdx ; "
            "orq %%rdx,%%rax"
            : "=a" (tmp)    /* output operands (a=rax register) */
            :               /* no input operands */
            : "rdx","cc"    /* things that got modified */
        );
    return tmp;
}

The gcc assembly instructions template

To insert assembly instructions inside a C function use the asm keyword as follows.

asm [volatile]
(
    assembler_template
    : output_operands
    : input_operands
    : clobbers
);

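The volatile keyword tells the compiler that the assembly code has effects beyond producing its outputs: the statement must not be deleted when its outputs are unused, merged with an identical statement, or hoisted out of a loop. Without it, the optimizer may remove the code or execute it fewer times than intended. (The compiler may still reorder a volatile asm relative to other code.)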

Details

The assembler_template is a string containing the assembly source code.

  • a pattern of the form %%reg refers to the specific register reg.

  • a pattern of the form %[name] refers to a register holding one input or output argument (the compiler chooses the register that will be used).

output_operands is a comma-separated list, possibly empty, of output or input/output parameters.

  • each output parameter has the form [name] constraint_string (lvalue), where constraint_string can be (incomplete list):

    • "=r", meaning that the output is stored in a register (the compiler may pick the same register as one of the inputs, so all inputs must be read before the output is written).

    • "=&r", meaning that the output is stored in a register that is not shared with any input (early clobber); use it when the output is written before all inputs have been read.

    • "+r", meaning the argument is used as input and output, stored in a register.

input_operands is a comma-separated list, possibly empty, of input parameters.

  • each input parameter has the form [name] constraint_string (C_expression), where constraint_string can be (among other possibilities):

    • "r", meaning that the input is stored in a register.

    • "m", meaning that the input is stored in a memory position.

  • note that input-only operands MUST NOT be modified by the assembly code.

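clobbers is a comma-separated list, possibly empty, of things changed by the assembly code other than its output operands; these include specific register names, "cc" (the condition flags) and "memory" (the code reads or writes memory that is not listed among the operands).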

A modifier letter placed between the % and the [name] selects which part of the register is named:

  • %b[name] specifies the low byte register name (bits 7..0)

  • %h[name] specifies the high byte register name (bits 15..8); this exists only for rax, rbx, rcx and rdx

  • %w[name] specifies the low word register name (bits 15..0)

  • %k[name] specifies the low doubleword register name (bits 31..0)

    • the letter k is used because l is already taken: %l refers to labels (in asm goto).

  • %q[name] specifies the quadword register name (bits 63..0).

Register usage conventions (System V AMD64 ABI, used on Linux):

  • rbx, rbp and r12-r15 must be saved and restored by any function that uses them.

  • return value in rax.

  • first 6 integer arguments in rdi, rsi, rdx, rcx, r8, and r9.
