Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

comparison and difference on unrelated pointers #269

Closed
ramosian-glider opened this issue Aug 31, 2015 · 10 comments
Closed

comparison and difference on unrelated pointers #269

ramosian-glider opened this issue Aug 31, 2015 · 10 comments

Comments

@ramosian-glider
Copy link
Member

Originally reported on Google Code with ID 269

From Dominique Pellé:
====================================================================
First of all, thank you for the address sanitizer.
This one exciting new tool for c++ developers.

A long time ago, I was using another dynamic
analyzer called Insure++ which was good too
(but not free software) and which found things that
Valgrind could not find. I don't have access to it
anymore. Insure++ also works at compilation time,
like ASAN. I remember a kind of bugs which Insure++
found and that ASAN does not find: comparison and
differences on unrelated pointers. See some simple
examples of such bugs detected by Insure++ here:

http://vasc.ri.cmu.edu/media/manuals/insure.5.x/ref/err_expp.htm
http://vasc.ri.cmu.edu/media/manuals/insure.5.x/ref/err_expf.htm

The obvious question: could ASAN also detect such bugs?

I remember finding a few such bugs with this check. Here
is a patch in Vim which I fixed with such a bug...

ftp://ftp.vim.org/pub/vim/patches/7.0/7.0.144

I don't remember the problem exactly, but I think
Vim compared a pointer in the heap with a another
pointer that could sometimes be set to a constant
empty string "" (so in .text section, or .rodata?)
and so comparing pointers did not make any sense.


I thought afterwards: maybe that'd be more
relevant for UBSAN than for ASAN.
====================================================================

This is a good idea, although the performance worries me. 
We could implement 
  __sanitizer_pointer_ge(void *a, void *b);
  __sanitizer_pointer_gt(void *a, void *b);
  __sanitizer_pointer_le(void *a, void *b);
  __sanitizer_pointer_lt(void *a, void *b);
  __sanitizer_pointer_sub(void *a, void *b);
and instrument the IR instructions like this:

  tail call void @__sanitizer_pointer_lt(i8* %a, i8* %b)
  %cmp = icmp ult i8* %a, %b

We may want to add some extra information from the frontend (e.g. names of variables).
There is also a risk that the operations on pointers will be transformed into operations
on integers by the time IR reaches asan instrumentation.
If we insert the callbacks in the frontend it may inhibit many optimizations downstream,
so an alternative could be to assign a metadata to the appropriate instructions. 

The implementation of these functions could be along these lines:

__sanitizer_pointer_ge(void *a, void *b) {
   uptr flag = common_flags()->detect_unrelated_pointers;
   if (!flag) return;
   if (flag) {   // cheap part
      // This part can be computed almost instantly with asan's allocator
      bool in_small_heap1 = asan_addr_is_in_small_heap(a);  // i.e. heap allocation
of <128K
      bool in_small_heap2 = asan_addr_is_in_small_heap(b);  // i.e. heap allocation
of <128K
      if (in_small_heap1 != in_small_heap2) BARK()
      if (in_small_heap1 && in_small_heap2 && 
         asan_addresses_belong_to_different_allocations(a, b)) BARK()
      // neither 'a' nor 'b' are in the small heap
   }
   if (flag < 2) return;
   // can't check cheaply, do the more expensive check
   // The pointers are either in large heap, stack or globals
   // We can tell which class 'a' and 'b' is with reasonable overhead, 
   // but not as cheap as for the small heap.
   // Then, if the pointers are in the same class, we'll need even more work
   // to tell if the objects are different.
}

The check sounds more like ubsan-ish at first, but I don't see how it can be implemented
w/o controlling the allocator.
We can implement the cheap part in msan and tsan in addition to asan because they share
the allocator, 
but the expensive part will require the tool to know the boundaries of stack- and global
objects, 
which is only available in asan. 

The problem is performance. Even if we hide this under a run-time flag like we do with
stack-use-after-return detection,
the overhead will be several instructions per pointer comparison when the checking
is off.
When the checking is on, the overhead will depend on where the pointers are.
If at least one is in the small heap (<128K, most common case) the overhead may be
reasonable. 
If not, the overhead may be monstrous.

This is worth experimenting anyway.

Reported by konstantin.s.serebryany on 2014-02-26 08:16:46

@ramosian-glider
Copy link
Member Author

Of course, even greater concern is the amount of true positives and whether 
users will be willing to fix them. We'll see :) 

Reported by konstantin.s.serebryany on 2014-02-26 09:29:57

@ramosian-glider
Copy link
Member Author

Probably we can figure out some strictness levels for this feature. E.g. do not report,
report more, report even more, report all.
I do not have ideas what patterns have higher chances of being "false/true positives"
(in the sense of being harmful and users are willing to fix them) for now.

Reported by dvyukov@google.com on 2014-02-26 09:49:25

@ramosian-glider
Copy link
Member Author

When I used that feature with Insure++ (a long time ago), I almost never
saw any false positives. The only false positive that I remember, what
a program that was comparing 2 unrelated pointers both in the stack, to
find out in which direction the stack was growing. But that's a fairly
odd and rare thing to do in the first place.  I think it was in the
only false positive I ever saw was in this function in Vim source code
(in vim/src/os_unix.c):

 696 static int stack_grows_downwards; 
 697
 698 /*
 699  * Find out if the stack grows upwards or downwards.
 700  * "p" points to a variable on the stack of the caller.
 701  */
 702     static void
 703 check_stack_growth(p)
 704     char        *p;
 705 {
 706     int         i;
 707 
 708     stack_grows_downwards = (p > (char *)&i);
 709 }

...

3053     int                 i;
3054 
3055     check_stack_growth((char *)&i);

The comparison at line 708 is undefined as it is true
if the stack is growing in one direction or false if
the stack is growing in the other direction. That's
precisely what that function is checking, so in that
case it was a false positive.

Thinking about it, it should not complain when checking in both
directions as the pointer comparison is then valid.  Example:

int *p;
int a[10];
...
if (p >= &a[0] && p < &a[10])
{
  /* This always OK */
  ...
}

if (p >= &a[0])
{
  /* This has undefined behaviour if p and &a[0] are unrelated pointers */
}

Reported by dominique.pelle on 2014-02-26 10:31:14

@ramosian-glider
Copy link
Member Author

The false positives you describe are equally bad under asan's stack-use-after-return
checker and I am not afraid of them. They are rare, and could be suppressed by 
__attribute__((no_sanitize_address)).

What I am afraid of is *true* positives that users will not want to fix,
e.g. std::set<void *> or some such. 

Reported by konstantin.s.serebryany on 2014-02-26 10:41:50

@ramosian-glider
Copy link
Member Author

In my previous comment, I wrote about the example function check_stack_growth(...):

> The comparison at line 708 is undefined as it is true
> if the stack is growing in one direction or false if
> the stack is growing in the other direction. That's
> precisely what that function is checking, so in that
> case it was a false positive.

Thinking about it, I wonder whether it's a false positive
or not.  What happens if the compiler inlines function
check_stack_growth(...)? Then I'm not sure any more whether
the function will returns the expected result.  So I'm
not sure how portable this function is and warning about
it does not look bad to me.

Reported by dominique.pelle on 2014-02-26 10:42:08

@ramosian-glider
Copy link
Member Author

First kill: http://llvm.org/viewvc/llvm-project?view=revision&revision=202266
This is my own asan test, 
it was putting malloc-ed pointers into a vector and then sorting them 
(on purpose! the test is very asan-specific)

Reported by konstantin.s.serebryany on 2014-02-26 14:02:12

@ramosian-glider
Copy link
Member Author

Filed a related bug for clang/llvm:
http://llvm.org/bugs/show_bug.cgi?id=18989

Reported by konstantin.s.serebryany on 2014-02-27 12:53:59

@ramosian-glider
Copy link
Member Author

Reported by ramosian.glider on 2015-07-30 09:05:32

  • Labels added: ProjectAddressSanitizer

@ramosian-glider
Copy link
Member Author

Adding Project:AddressSanitizer as part of GitHub migration.

Reported by ramosian.glider on 2015-07-30 09:06:56

@ramosian-glider
Copy link
Member Author

Let us track the further activity in http://llvm.org/bugs/show_bug.cgi?id=18989

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant