My favorites | Sign in
Project Home Downloads Wiki Issues Source
Search
for
AddressSanitizerAlgorithm  
AddressSanitizer algorithm
Updated Apr 9, 2012 by ramosian...@gmail.com

Short version

The run-time library replaces the malloc and free functions. The memory around malloc-ed regions (red zones) is poisoned. The free-ed memory is placed in quarantine and also poisoned. Every memory access in the program is transformed by the compiler in the following way:

Before:

*address = ...;  // or: ... = *address;

After:

if (IsPoisoned(address)) {
  ReportError(address, kAccessSize, kIsWrite);
}
*address = ...;  // or: ... = *address;

The tricky part is how to implement IsPoisoned very fast and ReportError very compact. Also, instrumenting some of the accesses may be proven redundant.

Memory mapping and Instrumentation

The virtual address space is divided into 2 disjoint classes:

  • Main application memory (Mem): this memory is used by the regular application code.
  • Shadow memory (Shadow): this memory contains the shadow values (or metadata).
There is a correspondence between the shadow and the main application memory. Poisoning a byte in the main memory means writing some special value into the corresponding shadow memory.

These 2 classes of memory should be organized in such a way that computing the shadow memory (MemToShadow) is fast.

The instrumentation performed by the compiler:

shadow_address = MemToShadow(address);
if (ShadowIsPoisoned(shadow_address)) {
  ReportError(address, kAccessSize, kIsWrite);
}

Mapping

AddressSanitizer maps 8 bytes of the application memory into 1 byte of the shadow memory.

There are only 9 different values for any aligned 8 bytes of the application memory:

  • All 8 bytes in qword are unpoisoned (i.e. addressible). The shadow value is 0.
  • All 8 bytes in qword are poisoned (i.e. not addressible). The shadow value is -1.
  • First k bytes are unpoisoned, the rest 8-k are poisoned. The shadow value is k.
This is guaranteed by the fact that malloc returns 8-byte aligned chunks of memory. The only case where different bytes of an aligned qword have different state is the tail of a malloc-ed region. For example, if we call malloc(13), we will have one full unpoisoned qword and one qword where 5 first bytes are unpoisoned.

The instrumentation looks like this:

byte *shadow_address = MemToShadow(address);
byte shadow_value = *shadow_address;
if (shadow_value) {
  if (SlowPathCheck(shadow_value, address, kAccessSize)) {
    ReportError(address, kAccessSize, kIsWrite);
  }
}
// Check the cases where we access first k bytes of the qword
// and these k bytes are unpoisoned.
bool SlowPathCheck(shadow_value, address, kAccessSize) {
  last_accessed_byte = (address & 7) + kAccessSize - 1;
  return last_accessed_byte >= shadow_value);
}

MemToShadow(ShadowAddr) falls into the ShadowGap region which is unaddressible. So, if the program tries to directly access a memory location in the shadow region, it will crash.

64-bit

Shadow = (Mem >> 3) | 0x0000100000000000;

[0x0000200000000000, 0x00007fffffffffff] HighMem
[0x0000140000000000, 0x00001fffffffffff] HighShadow
[0x0000120000000000, 0x000013ffffffffff] ShadowGap
[0x0000100000000000, 0x000011ffffffffff] LowShadow
[0x0000000000000000, 0x00000fffffffffff] LowMem

32 bit

Shadow = (Mem >> 3) | 0x20000000;

[0x40000000, 0xffffffff] HighMem
[0x28000000, 0x3fffffff] HighShadow
[0x24000000, 0x27ffffff] ShadowGap
[0x20000000, 0x23ffffff] LowShadow
[0x00000000, 0x1fffffff] LowMem

Ultra compact shadow

It is possible to use even more compact shadow memory, e.g.

Shadow = (Mem >> 7) | kOffset;

Experiments are in flight.

Report Error

The ReportError could be implemented as a call (this is the default now), but there are some other, slightly more efficient and/or more compact solutions. At some point the default behaviour was:

  • copy the failure address to %rax (%eax).
  • execute ud2 (generates SIGILL)
  • Encode access type and size in a one-byte instruction which follows ud2.
Overal these 3 instructions require 5-6 bytes of machine code.

It is possible to use just a single instruction (e.g. ud2), but this will require to have a full disassembler in the run-time library (or some other hacks).

Stack

In order to catch stack buffer overflow, AddressSanitizer instruments the code like this:

Original code:

void foo() {
  char a[8];
  ...
  return;
}

Instrumented code:

void foo() {
  char redzone1[32];  // 32-byte aligned
  char a[8];          // 32-byte aligned
  char redzone2[24];
  char redzone3[32];  // 32-byte aligned
  int  *shadow_base = MemToShadow(redzone1);
  shadow_base[0] = 0xffffffff;  // poison redzone1
  shadow_base[1] = 0xffffff00;  // poison redzone2, unpoison 'a'
  shadow_base[2] = 0xffffffff;  // poison redzone3
  ...
  shadow_base[0] = shadow_base[1] = shadow_base[2] = 0; // unpoison all
  return;
}

Examples of instrumented code

void write64(long long *a) {
  *a = 0;
}
void write16(short *a) {
  *a = 0;
}
0000000000402460 <write64>:
  402460:	push   %rbp           << stack frame
  402461:	mov    %rsp,%rbp    
  402464:	mov    %rdi,%rax
  402467:	shr    $0x3,%rax      << shift by 3
  40246b:	mov    $0x100000000000,%rcx
  402472:
  402475:	or     %rax,%rcx      << add offset
  402478:	cmpb   $0x0,(%rcx)    << compare shadow with 0
  40247b:	jne    402486 <write64+0x26>
  40247d:	movq   $0x0,(%rdi)    << original store
  402484:	pop    %rbp           
  402485:	retq   
  402486:	callq  405e10 <__asan_report_store8>  << report error
  40248b:	nopl   0x0(%rax,%rax,1)

0000000000402490 <write16>:
  402490:	push   %rbp
  402491:	mov    %rsp,%rbp
  402494:	mov    %rdi,%rax
  402497:	shr    $0x3,%rax
  40249b:	mov    $0x100000000000,%rcx
  4024a2:
  4024a5:	or     %rax,%rcx
  4024a8:	mov    (%rcx),%al
  4024aa:	test   %al,%al
  4024ac:	je     4024b9 <write16+0x29>
  4024ae:	mov    %edi,%ecx
  4024b0:	and    $0x7,%ecx
  4024b3:	inc    %ecx
  4024b5:	cmp    %al,%cl
  4024b7:	jge    4024c0 <write16+0x30>
  4024b9:	movw   $0x0,(%rdi)
  4024be:	pop    %rbp
  4024bf:	retq   
  4024c0:	callq  405e70 <__asan_report_store2>
  4024c5:	nopw   %cs:0x0(%rax,%rax,1)

Unaligned accesses

The current compact mapping will not catch unaligned partially out-of-bound accesses:

int *x = new int[2]; // 8 bytes: [0,7].
int *u = (int*)((char*)x + 6);
*u = 1;  // Access to range [6-9]

Possible solutions:

  • Use byte-to-byte mapping (cons: slower, uses more memory, seems to be available only on 64-bit arch)
  • Add a run-time check to the instrumentation code: if (addr & 7) slow_path_for_unaligned();
  • Ignore
  • Use even more compact mapping, e.g. Shadow=Mem>>7+offset (pros: even less memory; cons: slower code). This will catch some, but not all cases.

Run-time library

Malloc

The run-time library replaces malloc/free and provides error reporting functions like __asan_report_load8.

malloc allocates the requested amount of memory with redzones around it. The shadow values corresponding to the redzones are poisoned and the shadow values for the main memory region are cleared.

free poisons shadow values for the entire region and puts the chunk of memory into a quarantine queue (such that this chunk will not be returned again by malloc during some period of time).

Comments?

Send comments to address-sanitizer@googlegroups.com

Powered by Google Project Hosting