Export to GitHub

tenfourfox - issue #78

PIC patches for WebM AltiVec acceleration (just in case)


Posted on Jul 22, 2011 by Massive Rhino

Confirmed by Tobias. Eeek!

Attachments

Comment #1

Posted on Jul 22, 2011 by Happy Rabbit

Actually the accelereated function gets called be the macro "IDCT_INVOKE(rtcd, idct16)" which is called from one place in "vp8/common/invtrans.c".

So there already WAS one call to the accelerated function and one can see it by using "nm" on "invtrans.o".

And replacing the call to the C function in dequantize.c doesn't work well (at least in TFF7); the picture scaling is wrong and there are many colourful glitches. I can't see the reason for these problems; I don't understand much of PPC assembly but the assembly function is intended to do the same as the C version - and in invtrans.c it seems to work (given it is used there for real).

Would be worth a try in TFF6 as there the PIC-compatibility fixes aren't yet necessary.

Comment #2

Posted on Jul 22, 2011 by Happy Rabbit

Discovered that only in the optimized build I get the wrong scaling and the glitches in WebM videos - whether I use the altivec accelerated IDCT or not in "dequantize.c". In the debug build everything is fine. Might be some compiler switches(?); for example mozilla changed from "-mdynamic-no-pic" to "-fPIC" for XUL (but that also affects the debug build).

But using the accelerated IDCT in "dequantize.c" should work for TFF6.

Comment #3

Posted on Aug 13, 2011 by Massive Rhino

Because I've had no means of testing on the G5, I've punted on this one; I'm sorry and feel free to berate me. It's going to be in 7 once I get my network back.

Comment #4

Posted on Sep 4, 2011 by Massive Rhino

Changing disposition and summary

Comment #5

Posted on Sep 5, 2011 by Happy Rabbit

Did you do anything to work around the fact that WebM altivec code isn't PIC compatible?

I recently wanted to try using a different register for loading data in a PIC compatible way. I initially used r14 but that one cannot be used safely. So I tried to use r2 which is available on Mac OS X PPC but not in other PPC ABIs. But after building and an initial run of TFF the hard disk got corrupted by bad physical connection. Now I installed a different main board but still have to try to restore everything with the help of my other PowerBook...

Comment #6

Posted on Sep 5, 2011 by Massive Rhino

Oops

Comment #7

Posted on Sep 5, 2011 by Massive Rhino

Actually, I just used it as is. It builds fine here.

Comment #8

Posted on Sep 5, 2011 by Happy Rabbit

Might be PIC is disabled when building with gcc-4.0.1 ? I didn't see any hints to that but here it didn't build (initially with gcc-4.6 and later on with gcc-4.2

Nevertheless the reworked PIC compatibility patch seems to work well when using register r2 now also in the release build! No more artifacts so far.

Now I'll test again with using accelerated IDCT in dequantize.c . I'm still using my TFF7.0b1 build since rebuilding consumes so much time here...

Comment #9

Posted on Sep 6, 2011 by Happy Rabbit

WebM video working fine here with the patched dequantize.c . I can't see any obvious performance improvements on my G4 1,5 GHz. Might be decoding speed isn't actually the cause that is making WebM videos stutter intermittently here...

So this patch will not appear in TFF7?

Comment #10

Posted on Sep 7, 2011 by Happy Rabbit

Bug 648911 removed the possibility to use "-mdynamic-no-pic" which was the switch that made it possible that position dependant code (code that directly references global symbols) could be linked into the XUL framework - although this is supposed to work only for executables but not for dynamic libraries, but maybe framework bundles are different. I really wonder how your system could link it now that "-mdynamic-no-pic" isn't used any more.

Could you please verify whether "-fPIC" or "-mdynamic-no-pic" (or neither) is used on your system when compiling libvpx? (i.e. by calling "make clean" and "make" in the libvpx object directory)

Comment #11

Posted on Sep 7, 2011 by Massive Rhino

On the build host, -fPIC is defined. It builds without changes.

So I'm a bit confused as to what's going on and the exact problem. If it's a matter of using the VMX IDCT in dequantize, then I think we should wait until we've verified it actually does help (or we should finish writing all the possible IDCT algorithms and maximize the benefit).

If it's a matter of issues with PIC, I'm able to compile fine (at least opt builds) with it here with Xcode 2.5 and your linker. So something else is going on. I also don't follow what you mean in comment 8 unless this was in the changes you mailed to me and I missed it.

Comment #12

Posted on Sep 8, 2011 by Happy Rabbit

Actually that PIC stuff isn't directly related to the IDCT stuff - just because it affects the same code I posted it here.

So now I think that it's the differences between shared libraries and framework bundles that makes it possible linking non-PIC code into XUL. If XUL was a shared library (.dylib) or libvpx itself was built as a shared library that would fail. But it's also fact that now the entire code that goes into XUL is position independant with the exception of the altivec accelerations in libvpx. And although libvpx is explicitely told to generate position independant code (see vpx_config_tenfourfox_altivec.h: "#define CONFIG_PIC 1" ), the altivec accelerations aren't position independant. That fact is the problem I see.

I'll rebuild and see why I got the error XUL when linking the non-PIC libvpx altivec code into XUL. - Nevertheless to me it seems dangerous to ignore the fact that PIC is requested.

Comment #13

Posted on Sep 8, 2011 by Happy Rabbit

An updated version of the dequantize patch. It uses the same macro to invoke the accelerated IDCT as the one that is used in "invtrans.c"

Attachments

Comment #14

Posted on Sep 8, 2011 by Happy Rabbit

Can't reproduce the PIC related linker error I got when initially trying to get TFF7.0a2 built. Might be it has been fixed another way by all the symbol exporting and linking optimizations mozilla is/was doing to that huge XUL. Also found out that XUL is linked as a shared library in fact although there are circumstances (which I don't understand) when it is linked as (part) of a framework bundle. Of what I read when investigating a little was that linking XUL together is rather tricky.

However here comes the patch that makes the libvpx altivec code PIC compatible.

Attachments

Comment #15

Posted on Sep 8, 2011 by Massive Rhino

Do we really need that for every call? Will it not build at all on your system without it? That seems like it would hit our runtime; those m[tf]lrs aren't free.

I don't have a problem with the dequantize patch and I'll put it in the next beta (either 7 if we have to respin it or 8).

Comment #16

Posted on Sep 9, 2011 by Happy Rabbit

By saying "Can't reproduce the PIC related linker error I got when initially trying to get TFF7.0a2 built" I meant that with the current beta sources I don't get the PIC-related linker errors I got with the aurora sources. PIC hits runtime - otherwise all code would be compiled/written to be position independant, I think. I cannot tell whether non-PIC code in a ahared library might cause problems (or performance hits?) in some situations. Might be someone at mozilla or Apple might tell us?!

Comment #17

Posted on Sep 27, 2011 by Massive Rhino

Dequantize patch scheduled for 8

Comment #18

Posted on Oct 4, 2011 by Massive Rhino

Landed

Comment #19

Posted on Oct 8, 2011 by Massive Rhino

Dropping target since dequantize is now in 8. Revising summary.

Comment #20

Posted on Oct 17, 2011 by Massive Rhino

(changing owner)

Comment #21

Posted on Jan 6, 2012 by Massive Rhino

Mozilla upgraded the libvpx to 0.9.7 and broke our loop filter. I'm trying to piece it back together.

Comment #22

Posted on Jan 9, 2012 by Massive Rhino

Should work now, and is now multithreaded, which is a plus.

Comment #23

Posted on Feb 5, 2012 by Massive Rhino

Dropping to low for tracking purposes only, in case we actually do need the PIC work (so far doesn't look like it).

Comment #24

Posted on Feb 5, 2012 by Happy Rabbit

Dependance on PIC might depend on OS version.

Comment #25

Posted on Feb 5, 2012 by Massive Rhino

Do you still get linker errors?

Comment #26

Posted on Feb 5, 2012 by Happy Rabbit

Comment deleted

Comment #27

Posted on Feb 5, 2012 by Happy Rabbit

No. I think I got those errors when I ran "configure" on 10.5 and "build" on 10.4 or something like this.

Comment #28

Posted on May 12, 2012 by Happy Rabbit

I'm just building the first "aurorafox" and PIC is enabled per default when building for 10.5 . So I'll test and debug the PIC patches now.

Comment #29

Posted on May 12, 2012 by Happy Rabbit

What's also missing to be implemented is runtime Altivec detection in libvpx. (And maybe runtime G5 detection in methodjit)

Comment #30

Posted on May 12, 2012 by Happy Rabbit

Typing this in Aurora 14.0a2 running on 10.5.8 PowerPC G4. Altivec accelerated libvpx with PIC patches seems to be working well for the bigbuckbunny movie.

Comment #31

Posted on May 20, 2012 by Helpful Camel

Comment deleted

Comment #32

Posted on Oct 14, 2012 by Helpful Camel

Comment deleted

Comment #33

Posted on Dec 20, 2012 by Massive Rhino

Do we still even want this since we're compiling almost always with -mdynamic-no-pic?

Comment #34

Posted on Dec 20, 2012 by Happy Rabbit

We'd only want it if we needed it.

Comment #35

Posted on Jan 17, 2013 by Happy Rabbit

For the sake of reducing the number of open issues ;-).

Status: WontFix

Labels:
Type-Enhancement Priority-Low