perf: two-layer ibl hashtable with inner fixed-size #31

Open
derekbruening opened this issue Nov 27, 2014 · 1 comment

From derek.br...@gmail.com on February 21, 2009 10:52:11

We should try a two-layer scheme for rets: a 256-entry table updated on
every call, with collision chaining at the target.

See Ole's paper: http://engweb.vmware.com/~agesen/wbia2006.pdf

Summary of the scheme:

2-level return lookup hashtable
  1st level: fixed size, direct-mapped, 256 entries
    no cmp for empty or for collision (cmp at target)
  2nd level: full table
@ every call: prime the first level w/ a store
  mov after_call_frag_prefx => table_slot
@ return
  spill eax, ecx
  ret addr -> ecx
  movzx cl -> eax
  jmp *table_base + eax
@ after_call_frag_prefx
  lea compare ecx to after_call_addr
  => miss: spill flags, full ret ibl
     hit: continue (common case no eflags and small dcache footprint)
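
To make the control flow above concrete, here is a minimal C simulation of the two-level lookup. It is only a sketch under stated assumptions, not DynamoRIO code: `fragment_t`, `on_call`, `on_return`, `l2_add`, `l2_lookup`, the 1024-entry second level, and its linear probing are all hypothetical stand-ins. The sketch models the "no cmp for empty" point by filling the first level with a sentinel fragment whose address never matches, so the only compare happens at the target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define L1_ENTRIES 256   /* first level: fixed size, direct-mapped */
#define L2_ENTRIES 1024  /* second level: stand-in for the full ret ibl table (size is an assumption) */

/* Hypothetical stand-in for an after-call fragment in the code cache. */
typedef struct fragment {
    uintptr_t app_ret_addr; /* application return address this fragment handles */
} fragment_t;

/* "Table-empty" target: address 0 is assumed never to be a real return address. */
static fragment_t empty_fragment;
static fragment_t *l1_table[L1_ENTRIES];
static fragment_t *l2_table[L2_ENTRIES];
static size_t l2_count;
static unsigned l1_hits, l1_misses;

/* movzx cl -> eax: the index is the low byte of the return address. */
static size_t l1_index(uintptr_t ret_addr) { return ret_addr & 0xff; }
static size_t l2_index(uintptr_t ret_addr) { return (ret_addr >> 2) % L2_ENTRIES; }

void ret_tables_init(void) {
    for (size_t i = 0; i < L1_ENTRIES; i++)
        l1_table[i] = &empty_fragment; /* empty slots always miss at the target */
    for (size_t i = 0; i < L2_ENTRIES; i++)
        l2_table[i] = NULL;
    l2_count = 0;
    l1_hits = l1_misses = 0;
}

/* Second level: open addressing with linear probing (an assumption of this sketch). */
void l2_add(fragment_t *f) {
    size_t i = l2_index(f->app_ret_addr);
    while (l2_table[i] != NULL && l2_table[i] != f)
        i = (i + 1) % L2_ENTRIES;
    if (l2_table[i] == NULL) {
        assert(l2_count + 1 < L2_ENTRIES); /* keep one slot free so probing terminates */
        l2_table[i] = f;
        l2_count++;
    }
}

fragment_t *l2_lookup(uintptr_t ret_addr) {
    size_t i = l2_index(ret_addr);
    while (l2_table[i] != NULL) {
        if (l2_table[i]->app_ret_addr == ret_addr)
            return l2_table[i];
        i = (i + 1) % L2_ENTRIES;
    }
    return NULL; /* a real system would fall back further at this point */
}

/* @ every call: prime the first level with a single store. */
void on_call(fragment_t *after_call) {
    l1_table[l1_index(after_call->app_ret_addr)] = after_call;
}

/* @ return: jump through the first level with no compare, then check at the target. */
fragment_t *on_return(uintptr_t ret_addr) {
    fragment_t *target = l1_table[l1_index(ret_addr)];
    if (target->app_ret_addr == ret_addr) { /* compare in the after-call prefix */
        l1_hits++;
        return target;
    }
    l1_misses++;                  /* miss: take the full ret ibl path */
    return l2_lookup(ret_addr);
}
```

A caller would run `ret_tables_init()` once, register each after-call fragment with `l2_add()`, call `on_call()` at every simulated call site, and call `on_return()` with the popped return address. Two return addresses that share a low byte overwrite each other in the first level, so the older one misses there and is resolved by the second level.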

discussion notes:

  • thread-shared full ret ibl table
  • thread-private 256-entry table: but then with shared fragments, do we need
    a spill + extra instrs on the call store?
    stick the whole table in the TEB
    or use our own segment (PR 208009)
    or try a shared table: on a uniprocessor it may work fine
  • no call inlining
  • mark after-call as FRAG_X, propagate to the trace if the 1st bb is there
  • insert the collision prefix if a fragment starts w/ FRAG_X
  • need to manage the hardcoded code cache addr versus cache deletion
    hack to combine w/ linking:
    • put in an unreachable jmp to the after-call addr, after the jmp to the callee
    • when unlinked, put the table-empty addr there
    • when linked, put the code cache addr there
  • selfprot: how do we allow writes to the table?
    if through a segment and ds has its limit below it then we could protect

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=31

@derekbruening

From derek.br...@gmail.com on February 21, 2009 07:53:46

this was PR 191978

derekbruening added a commit that referenced this issue Jul 5, 2023
8 of the 13 tests on Mac AArch64 labeled "OSX" fail prior to this PR.
Here we fix the following:

+ Syscall success is indicated by the carry flag just like x86 Mac
+ Handle sigreturn with its extra parameters just like x86 Mac
+ Fix signal handler parameters
+ Fix stolen register support in signal contexts
+ Use MAP_JIT and pthread_jit_write_protect_np for +rwx gencode in tests
+ Use DYLD_LIBRARY_PATH on Mac in tests

Now all 13 tests pass:
---------------------------------------------------------------------------------------
ctest -j 5 -L OSX
 1/13 Test  #13: code_api|common.fib ................................  Passed  0.59 sec
 2/13 Test #243: code_api|libutil.frontend_test .....................  Passed  0.63 sec
 3/13 Test #231: code_api|api.ir ....................................  Passed  0.67 sec
 4/13 Test   #9: code_api|linux.sigaction.native ....................  Passed  0.25 sec
 5/13 Test  #31: code_api|linux.signal0000 ..........................  Passed  0.10 sec
 6/13 Test #240: code_api|api.ir-static .............................  Passed  0.34 sec
 7/13 Test #241: code_api|api.drdecode ..............................  Passed  0.38 sec
 8/13 Test #245: code_api|api.dis-a64 ...............................  Passed  1.15 sec
 9/13 Test #264: no_code_api,no_intercept_all_signals|linux.sigaction  Passed  0.08 sec
10/13 Test  #33: code_api|linux.signal0010 ..........................  Passed  0.34 sec
11/13 Test  #35: code_api|linux.signal0100 ..........................  Passed  0.42 sec
12/13 Test  #37: code_api|linux.signal0110 ..........................  Passed  0.45 sec
13/13 Test   #7: samples_proj .......................................  Passed  1.89 sec
100% tests passed, 0 tests failed out of 13
---------------------------------------------------------------------------------------

Issue: #5383
github-merge-queue bot pushed a commit that referenced this issue Jul 7, 2023