Issue 1518: Shorter ia32 deferred code fragments
Status:  Accepted
Owner:  ----

Reported by, Jun 30, 2011
Deferred code segments could generally be shorter and straight-line in the common case.

If the Generate() methods are changed to take the EXIT label, the deferred code has fewer constraints.

Other than the space saving, I don't see this having much impact on benchmarks.


0xf53b0ffd   285  8179ffa14037f5 cmp [ecx+0xff],0xf53740a1    ;; object: 0xf53740a1 <Map>
0xf53b1004   292  0f85483077ff   jnz 0xf4b24052              ;; deoptimization bailout 5
0xf53b100a   298  f20f104103     movsd xmm0,[ecx+0x3]
0xf53b100f   303  f20f2cc8       cvttsd2si ecx,xmm0
0xf53b1013   307  f20f2ac9       cvtsi2sd xmm1,ecx
0xf53b1017   311  660f2ec1       ucomisd xmm0,xmm1
0xf53b101b   315  0f85313077ff   jnz 0xf4b24052              ;; deoptimization bailout 5
0xf53b1021   321  0f8a2b3077ff   jpe 0xf4b24052              ;; deoptimization bailout 5
0xf53b1027   327  85c9           test ecx,ecx
0xf53b1029   329  0f850d000000   jnz 348  (0xf53b103c)
0xf53b102f   335  660f50c8       movmskpd ecx,xmm0
0xf53b1033   339  83e101         and ecx,0x1
0xf53b1036   342  0f85163077ff   jnz 0xf4b24052              ;; deoptimization bailout 5
0xf53b103c   348  e9e0feffff     jmp 65  (0xf53b0f21)

(68 bytes + 4 relocation records)

Proposed layout:

	cmp [r-1],<Map>
	jnz short bail
	movsd xmm0,[r+3]
	cvttsd2si r,xmm0
	cvtsi2sd xmm1,r
	ucomisd xmm0,xmm1
	jnz short bail
	jpe short bail
	test r,r
	jnz EXIT
	movmskpd r,xmm0
	and r,1
	jz EXIT
bail:
	jmp deoptimization_bailout_N

(56 bytes + 1 relocation record)
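
The two totals can be checked by adding up instruction lengths. The sketch below is only a tally, not anything from the code generator. The lengths for the current layout are read off the encodings in the disassembly. The proposed layout assumes the SSE2 conversion instructions carry over unchanged, that `jnz short`/`jpe short` encode in 2 bytes, that the jumps to EXIT need the 6-byte near form (the existing jump back to main code already uses a 32-bit displacement), and that "relocation records" counts only jumps to deoptimization entries.

```python
# Each entry: (mnemonic, length in bytes, jumps to a deoptimization entry?)
# Lengths for the current layout come from the hex encodings in the listing.
CURRENT_LAYOUT = [
    ("cmp [ecx-1],<Map>",   7, False),
    ("jnz deopt",           6, True),
    ("movsd xmm0,[ecx+3]",  5, False),
    ("cvttsd2si ecx,xmm0",  4, False),
    ("cvtsi2sd xmm1,ecx",   4, False),
    ("ucomisd xmm0,xmm1",   4, False),
    ("jnz deopt",           6, True),
    ("jpe deopt",           6, True),
    ("test ecx,ecx",        2, False),
    ("jnz done",            6, False),
    ("movmskpd ecx,xmm0",   4, False),
    ("and ecx,1",           3, False),
    ("jnz deopt",           6, True),
    ("jmp EXIT",            5, False),
]

# Assumed encodings for the proposed layout (see the note above).
PROPOSED_LAYOUT = [
    ("cmp [r-1],<Map>",     7, False),
    ("jnz short bail",      2, False),
    ("movsd xmm0,[r+3]",    5, False),
    ("cvttsd2si r,xmm0",    4, False),
    ("cvtsi2sd xmm1,r",     4, False),
    ("ucomisd xmm0,xmm1",   4, False),
    ("jnz short bail",      2, False),
    ("jpe short bail",      2, False),
    ("test r,r",            2, False),
    ("jnz EXIT",            6, False),
    ("movmskpd r,xmm0",     4, False),
    ("and r,1",             3, False),
    ("jz EXIT",             6, False),
    ("jmp deopt",           5, True),
]

def summarize(layout):
    """Return (total bytes, number of jumps to deoptimization entries)."""
    total_bytes = sum(size for _, size, _ in layout)
    deopt_jumps = sum(1 for _, _, is_deopt in layout if is_deopt)
    return total_bytes, deopt_jumps

print(summarize(CURRENT_LAYOUT))   # (68, 4)
print(summarize(PROPOSED_LAYOUT))  # (56, 1)
```

Under these assumptions the saving is 12 bytes: three 6-byte deoptimization branches shrink to 2-byte short branches onto a single shared `jmp`.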

The above code branches forward in the unexpected case and backwards (to the main code) in the expected case, consistent with:

[Intel® 64 and IA-32 Architectures Optimization Reference Manual, Order Number: 248966-024, April 2011]

Assembly/Compiler Coding Rule 3. (M impact, H generality) Arrange code to
be consistent with the static branch prediction algorithm: make the fall-through
code following a conditional branch be the likely target for a branch with a forward
target, and make the fall-through code following a conditional branch be the
unlikely target for a branch with a backward target.

(However, I have read that recent microarchitectures tend to always use dynamic prediction rather than this algorithm.)
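
As a rough illustration of the quoted rule, here is a minimal sketch of the static heuristic applied to the `test`/`jnz` pair. The function name is made up for this example, and the branch address used for the proposed layout is hypothetical (the real address would shift with the shorter code). Only the direction of the branch matters.

```python
def static_prediction(branch_addr, target_addr):
    """Static heuristic from the quoted rule: a backward conditional
    branch is predicted taken, a forward one is predicted not taken."""
    return "taken" if target_addr < branch_addr else "not taken"

# Current layout: the jnz at 0xf53b1029 skips forward to 0xf53b103c.
# In the expected case (nonzero result) it is taken, so the static
# guess of "not taken" is wrong.
print(static_prediction(0xf53b1029, 0xf53b103c))  # not taken

# Proposed layout: jnz EXIT goes backward to main code at 0xf53b0f21.
# The branch address here is hypothetical; the static guess of "taken"
# matches the expected case.
print(static_prediction(0xf53b1029, 0xf53b0f21))  # taken
```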

Jul 5, 2011
Project Member #1
Saving space is good. Since this is in deferred code and the deopt case is almost never executed, there shouldn't be any performance degradation.
Status: Accepted