Comment #1
Posted on Jul 22, 2011 by Happy RabbitActually the accelereated function gets called be the macro "IDCT_INVOKE(rtcd, idct16)" which is called from one place in "vp8/common/invtrans.c".
So there already WAS one call to the accelerated function and one can see it by using "nm" on "invtrans.o".
And replacing the call to the C function in dequantize.c doesn't work well (at least in TFF7); the picture scaling is wrong and there are many colourful glitches. I can't see the reason for these problems; I don't understand much of PPC assembly but the assembly function is intended to do the same as the C version - and in invtrans.c it seems to work (given it is used there for real).
Would be worth a try in TFF6 as there the PIC-compatibility fixes aren't yet necessary.
Comment #2
Posted on Jul 22, 2011 by Happy RabbitDiscovered that only in the optimized build I get the wrong scaling and the glitches in WebM videos - whether I use the altivec accelerated IDCT or not in "dequantize.c". In the debug build everything is fine. Might be some compiler switches(?); for example mozilla changed from "-mdynamic-no-pic" to "-fPIC" for XUL (but that also affects the debug build).
But using the accelerated IDCT in "dequantize.c" should work for TFF6.
Comment #3
Posted on Aug 13, 2011 by Massive RhinoBecause I've had no means of testing on the G5, I've punted on this one; I'm sorry and feel free to berate me. It's going to be in 7 once I get my network back.
Comment #4
Posted on Sep 4, 2011 by Massive RhinoChanging disposition and summary
Comment #5
Posted on Sep 5, 2011 by Happy RabbitDid you do anything to work around the fact that WebM altivec code isn't PIC compatible?
I recently wanted to try using a different register for loading data in a PIC compatible way. I initially used r14 but that one cannot be used safely. So I tried to use r2 which is available on Mac OS X PPC but not in other PPC ABIs. But after building and an initial run of TFF the hard disk got corrupted by bad physical connection. Now I installed a different main board but still have to try to restore everything with the help of my other PowerBook...
Comment #6
Posted on Sep 5, 2011 by Massive RhinoOops
Comment #7
Posted on Sep 5, 2011 by Massive RhinoActually, I just used it as is. It builds fine here.
Comment #8
Posted on Sep 5, 2011 by Happy RabbitMight be PIC is disabled when building with gcc-4.0.1 ? I didn't see any hints to that but here it didn't build (initially with gcc-4.6 and later on with gcc-4.2
Nevertheless the reworked PIC compatibility patch seems to work well when using register r2 now also in the release build! No more artifacts so far.
Now I'll test again with using accelerated IDCT in dequantize.c . I'm still using my TFF7.0b1 build since rebuilding consumes so much time here...
Comment #9
Posted on Sep 6, 2011 by Happy RabbitWebM video working fine here with the patched dequantize.c . I can't see any obvious performance improvements on my G4 1,5 GHz. Might be decoding speed isn't actually the cause that is making WebM videos stutter intermittently here...
So this patch will not appear in TFF7?
Comment #10
Posted on Sep 7, 2011 by Happy RabbitBug 648911 removed the possibility to use "-mdynamic-no-pic" which was the switch that made it possible that position dependant code (code that directly references global symbols) could be linked into the XUL framework - although this is supposed to work only for executables but not for dynamic libraries, but maybe framework bundles are different. I really wonder how your system could link it now that "-mdynamic-no-pic" isn't used any more.
Could you please verify whether "-fPIC" or "-mdynamic-no-pic" (or neither) is used on your system when compiling libvpx? (i.e. by calling "make clean" and "make" in the libvpx object directory)
Comment #11
Posted on Sep 7, 2011 by Massive RhinoOn the build host, -fPIC is defined. It builds without changes.
So I'm a bit confused as to what's going on and the exact problem. If it's a matter of using the VMX IDCT in dequantize, then I think we should wait until we've verified it actually does help (or we should finish writing all the possible IDCT algorithms and maximize the benefit).
If it's a matter of issues with PIC, I'm able to compile fine (at least opt builds) with it here with Xcode 2.5 and your linker. So something else is going on. I also don't follow what you mean in comment 8 unless this was in the changes you mailed to me and I missed it.
Comment #12
Posted on Sep 8, 2011 by Happy RabbitActually that PIC stuff isn't directly related to the IDCT stuff - just because it affects the same code I posted it here.
So now I think that it's the differences between shared libraries and framework bundles that makes it possible linking non-PIC code into XUL. If XUL was a shared library (.dylib) or libvpx itself was built as a shared library that would fail. But it's also fact that now the entire code that goes into XUL is position independant with the exception of the altivec accelerations in libvpx. And although libvpx is explicitely told to generate position independant code (see vpx_config_tenfourfox_altivec.h: "#define CONFIG_PIC 1" ), the altivec accelerations aren't position independant. That fact is the problem I see.
I'll rebuild and see why I got the error XUL when linking the non-PIC libvpx altivec code into XUL. - Nevertheless to me it seems dangerous to ignore the fact that PIC is requested.
Comment #13
Posted on Sep 8, 2011 by Happy RabbitAn updated version of the dequantize patch. It uses the same macro to invoke the accelerated IDCT as the one that is used in "invtrans.c"
Comment #14
Posted on Sep 8, 2011 by Happy RabbitCan't reproduce the PIC related linker error I got when initially trying to get TFF7.0a2 built. Might be it has been fixed another way by all the symbol exporting and linking optimizations mozilla is/was doing to that huge XUL. Also found out that XUL is linked as a shared library in fact although there are circumstances (which I don't understand) when it is linked as (part) of a framework bundle. Of what I read when investigating a little was that linking XUL together is rather tricky.
However here comes the patch that makes the libvpx altivec code PIC compatible.
- libvpx_PIC.patch 32.9KB
Comment #15
Posted on Sep 8, 2011 by Massive RhinoDo we really need that for every call? Will it not build at all on your system without it? That seems like it would hit our runtime; those m[tf]lrs aren't free.
I don't have a problem with the dequantize patch and I'll put it in the next beta (either 7 if we have to respin it or 8).
Comment #16
Posted on Sep 9, 2011 by Happy RabbitBy saying "Can't reproduce the PIC related linker error I got when initially trying to get TFF7.0a2 built" I meant that with the current beta sources I don't get the PIC-related linker errors I got with the aurora sources. PIC hits runtime - otherwise all code would be compiled/written to be position independant, I think. I cannot tell whether non-PIC code in a ahared library might cause problems (or performance hits?) in some situations. Might be someone at mozilla or Apple might tell us?!
Comment #17
Posted on Sep 27, 2011 by Massive RhinoDequantize patch scheduled for 8
Comment #18
Posted on Oct 4, 2011 by Massive RhinoLanded
Comment #19
Posted on Oct 8, 2011 by Massive RhinoDropping target since dequantize is now in 8. Revising summary.
Comment #20
Posted on Oct 17, 2011 by Massive Rhino(changing owner)
Comment #21
Posted on Jan 6, 2012 by Massive RhinoMozilla upgraded the libvpx to 0.9.7 and broke our loop filter. I'm trying to piece it back together.
Comment #22
Posted on Jan 9, 2012 by Massive RhinoShould work now, and is now multithreaded, which is a plus.
Comment #23
Posted on Feb 5, 2012 by Massive RhinoDropping to low for tracking purposes only, in case we actually do need the PIC work (so far doesn't look like it).
Comment #24
Posted on Feb 5, 2012 by Happy RabbitDependance on PIC might depend on OS version.
Comment #25
Posted on Feb 5, 2012 by Massive RhinoDo you still get linker errors?
Comment #26
Posted on Feb 5, 2012 by Happy RabbitComment deleted
Comment #27
Posted on Feb 5, 2012 by Happy RabbitNo. I think I got those errors when I ran "configure" on 10.5 and "build" on 10.4 or something like this.
Comment #28
Posted on May 12, 2012 by Happy RabbitI'm just building the first "aurorafox" and PIC is enabled per default when building for 10.5 . So I'll test and debug the PIC patches now.
Comment #29
Posted on May 12, 2012 by Happy RabbitWhat's also missing to be implemented is runtime Altivec detection in libvpx. (And maybe runtime G5 detection in methodjit)
Comment #30
Posted on May 12, 2012 by Happy RabbitTyping this in Aurora 14.0a2 running on 10.5.8 PowerPC G4. Altivec accelerated libvpx with PIC patches seems to be working well for the bigbuckbunny movie.
Comment #31
Posted on May 20, 2012 by Helpful CamelComment deleted
Comment #32
Posted on Oct 14, 2012 by Helpful CamelComment deleted
Comment #33
Posted on Dec 20, 2012 by Massive RhinoDo we still even want this since we're compiling almost always with -mdynamic-no-pic?
Comment #34
Posted on Dec 20, 2012 by Happy RabbitWe'd only want it if we needed it.
Comment #35
Posted on Jan 17, 2013 by Happy RabbitFor the sake of reducing the number of open issues ;-).
Status: WontFix
Labels:
Type-Enhancement
Priority-Low