fluorinefx - issue #11

ServiceInvoker is occasionally passing in the wrong service class

Posted on Apr 2, 2010 by Happy Bear

We currently operate in a highly concurrent (~10 req/sec) environment and occasionally see 'Could not find a suitable method with name...' errors on calls with valid method names, valid parameters, valid service classes, etc.
The errors are quite random, but with the addition of some new logging code, I can see that MethodHandler.GetMethod() is being passed the wrong service type and hence cannot find the method.

Comment #1

Posted on Jun 13, 2010 by Helpful Monkey

We're experiencing this issue as well. The problem is a race condition in FluorineFx/Messaging/Endpoints/Filter/ProcessFilter.cs, lines 152-156. The FactoryInstance returned from destination.GetFactoryInstance() on line 152 is shared between threads, and what often happens is that factoryInstance.Source is overwritten by another thread before factoryInstance.Lookup() is called. The wrong class is then used to look up the remote method, so the method cannot be found. It would be great if someone could fix this. We don't know nearly enough about the internals of FluorineFx to feel safe implementing a fix ourselves, and we don't want to take the performance hit of throwing a global lock around everything, which is definitely not necessary.
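
The interleaving described above can be reproduced in isolation. The following is a minimal sketch, not FluorineFx code: SharedFactoryInstance, RaceSketch and the service names are hypothetical stand-ins, and two events force the ordering that happens only occasionally in production.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the shared factory instance; not the FluorineFx class.
class SharedFactoryInstance
{
    public string Source { get; set; }

    // Resolves whatever Source holds at the moment of the call.
    public string Lookup() { return Source; }
}

static class RaceSketch
{
    // Returns the source that thread A ends up resolving.
    public static string Run()
    {
        var shared = new SharedFactoryInstance();
        var aHasSet = new ManualResetEventSlim(false);
        var bHasSet = new ManualResetEventSlim(false);
        string resolvedByA = null;

        var a = new Thread(() =>
        {
            shared.Source = "ServiceA";   // A sets the source for its request
            aHasSet.Set();
            bHasSet.Wait();               // forced gap between the set and the lookup
            resolvedByA = shared.Lookup();
        });
        var b = new Thread(() =>
        {
            aHasSet.Wait();
            shared.Source = "ServiceB";   // B overwrites the source inside A's gap
            bHasSet.Set();
        });

        a.Start();
        b.Start();
        a.Join();
        b.Join();
        return resolvedByA;
    }
}
```

With this forced ordering, RaceSketch.Run() returns "ServiceB" even though thread A asked for "ServiceA", which is the mismatch that leads to the "Could not find a suitable method" error.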

Comment #2

Posted on Oct 29, 2010 by Swift Kangaroo

We experienced this as well. The solution for us was to define multiple destinations in services-config.xml, as opposed to one "fluorine" destination with multiple sources. By default, Fluorine expects your custom services-config.xml to be in WEB-INF/flex. You can of course change this in the source.
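
As a rough sketch of that workaround, assuming the usual Flex-style destination/properties/source layout, each service class gets its own destination. The destination ids and class names below are hypothetical, and the surrounding service and channel elements are left out.

```xml
<!-- Sketch only: ids and class names are hypothetical placeholders. -->
<!-- One destination per service class, instead of one shared "fluorine" destination. -->
<destination id="orderService">
  <properties>
    <source>MyApp.Services.OrderService</source>
  </properties>
</destination>
<destination id="userService">
  <properties>
    <source>MyApp.Services.UserService</source>
  </properties>
</destination>
```

Each destination then has its own factory instance with a fixed source, so concurrent requests for different services no longer write to the same Source value. The client has to address each service through its own destination id.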

Comment #3

Posted on Nov 11, 2010 by Massive Wombat

I'm experiencing the same problem on a production server. Did you come up with an easy way to replicate it in a local environment?

Comment #4

Posted on Nov 11, 2010 by Happy Cat

Couldn't reproduce locally. Going to try the specific source/dest setup between config and Flex that Damien recommended. Won't know whether it works until we stop getting hundreds of errors a day after we push live. :)

Comment #5

Posted on Nov 13, 2010 by Massive Wombat

Yes, I'm in exactly the same situation. In order to reproduce it locally, I'm trying to call my gateway from a C# console application. I'm using NetConnection and passing it "http://localhost:59632/ws/Gatewayx.aspx". Is that the right way to go? I haven't got it working yet.

Comment #6

Posted on Nov 13, 2010 by Happy Bear

I assume that you need to pass in an AMF-encoded message from .NET. You'll also need to initiate multiple calls at once, and those calls should target different services.

Comment #7

Posted on Nov 17, 2010 by Massive Wombat

Yes, you're right. I'm trying to make it work from a C# application. If that doesn't work, I can still try it from a Flash app, although that is not multi-threaded. I will keep you updated.

Comment #8

Posted on Nov 17, 2010 by Happy Bear

Capture the sample AMF output with Service Capture or a similar tool. You should then be able to simulate a request from Flex by embedding your captured AMF packet.

Comment #9

Posted on Nov 17, 2010 by Helpful Monkey

Here's a patch that seems to fix the problem. It just adds a few locks. Note: It might be out of date, the diff paths are wrong, and it includes a bunch of unnecessary whitespace changes made by Visual Studio :P.
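
The attached patch is not reproduced in this thread, so the following is only a sketch of the general approach and not the patch itself. It assumes the fix amounts to holding one lock across the "set Source, then Lookup" pair. FactoryInstance here is a hypothetical stand-in, and Gate and LockSketch are made-up names.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the shared factory instance; not the FluorineFx class.
class FactoryInstance
{
    public string Source { get; set; }
    public string Lookup() { return Source; }
}

static class LockSketch
{
    // Made-up name: a single lock object guarding the shared instance.
    private static readonly object Gate = new object();

    // Counts how often a thread resolves a source other than the one it set.
    public static int CountMismatches(int threadCount, int callsPerThread)
    {
        var shared = new FactoryInstance();
        int mismatches = 0;
        var threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            string mine = "Service" + i;
            threads[i] = new Thread(() =>
            {
                for (int n = 0; n < callsPerThread; n++)
                {
                    string resolved;
                    lock (Gate)
                    {
                        // The set and the lookup form one critical section,
                        // so no other thread can change Source in between.
                        shared.Source = mine;
                        resolved = shared.Lookup();
                    }
                    if (resolved != mine)
                        Interlocked.Increment(ref mismatches);
                }
            });
        }

        foreach (var t in threads) t.Start();
        foreach (var t in threads) t.Join();
        return mismatches;
    }
}
```

With the lock in place, LockSketch.CountMismatches(8, 10000) returns 0. The critical section covers only the set-and-lookup pair, not the whole request.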


Comment #10

Posted on Nov 18, 2010 by Grumpy Giraffe

@zieDaniel1 Trying your fix now... Will try to remember to update if this works for me :)

Comment #11

Posted on Nov 22, 2010 by Massive Wombat

Thanks @zieDaniel1! Also trying the fix on a local setup.

Comment #12

Posted on Nov 22, 2010 by Massive Wombat

I uploaded the .NET client I'm using to test out FluorineFx with and without zieDaniel1's patch: https://github.com/jdecuyper/FluorineFxNetClient

Comment #13

Posted on Jan 4, 2011 by Massive Wombat

With the help of a small C# Fluorine client, I found that the easiest way to trigger the error is simply to put the current thread to sleep when a specific type is requested, inside ProcessFilter.cs:

    factoryInstance.Source = amfBody.TypeName;
    if (amfBody.TypeName == "aSpecialType")
        Thread.Sleep(500);
    if (FluorineContext.Current.ActivationMode != null) // query string can override the activation mode
        factoryInstance.Scope = FluorineContext.Current.ActivationMode;
    instance = factoryInstance.Lookup();

Since the current thread sleeps for half a second, it gives another thread enough time to overwrite the Source value and cause the "Could not find a suitable method with name ..." error. With the lock proposed by zieDaniel1 added, the error no longer shows up.

I have been trying to benchmark both DLLs (with and without the lock), locally and on a remote server, but was not able to find a relevant difference between them. Basically, I have a big loop that fires 100 methods one by one and stores the time elapsed between the request and the response. Results vary between 0 and 25 milliseconds. The average time with the lock is 78 milliseconds and without the lock 75 milliseconds. What other kind of benchmark can you guys think of?

I uploaded a small graph to visualize the benchmark: http://jdecuyper.com/fx/index.html

Let me know your thoughts, thanks!

Comment #14

Posted on May 31, 2012 by Helpful Giraffe

A great fix for this on .NET 4.0 is to use the ThreadLocal<T> class. The root of this whole defect is in /FluorineFx/Messaging/Destination.cs, where the "GetFactoryInstance" method returns a singleton factory instance. If you made the _factoryInstance variable a ThreadLocal<FactoryInstance> instead of a FactoryInstance, a FactoryInstance would be created for each thread and there would be none of the contention discussed above.

The only problem is with application-scoped destinations (i.e. a destination whose scope is set to 'application'). In that case, the instances will actually be thread scoped, not application scoped. This could be fixed by modifying the 'Lookup' method in DotNetFactory.cs to retrieve the application-scoped instance from FluorineContext.Current.ApplicationState.
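
A minimal sketch of the ThreadLocal idea, using hypothetical stand-ins for Destination and FactoryInstance (the real FluorineFx classes have more members and construct the factory instance differently):

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the FluorineFx class of the same name.
class FactoryInstance
{
    public string Source { get; set; }
}

// Hypothetical stand-in for the FluorineFx Destination class.
class Destination
{
    // One FactoryInstance per thread instead of one shared by all threads.
    private readonly ThreadLocal<FactoryInstance> _factoryInstance =
        new ThreadLocal<FactoryInstance>(() => new FactoryInstance());

    public FactoryInstance GetFactoryInstance()
    {
        return _factoryInstance.Value;
    }
}

static class ThreadLocalSketch
{
    // Returns true when a second thread's write does not disturb the caller's Source.
    public static bool Run()
    {
        var destination = new Destination();
        FactoryInstance seenByCaller = destination.GetFactoryInstance();
        seenByCaller.Source = "ServiceA";

        FactoryInstance seenByWorker = null;
        var worker = new Thread(() =>
        {
            seenByWorker = destination.GetFactoryInstance();
            seenByWorker.Source = "ServiceB"; // lands on the worker's own instance
        });
        worker.Start();
        worker.Join();

        return !ReferenceEquals(seenByCaller, seenByWorker)
            && seenByCaller.Source == "ServiceA";
    }
}
```

ThreadLocalSketch.Run() returns true: each thread gets its own instance, so setting Source on one thread cannot affect a Lookup on another. As noted above, application-scoped state would need to be retrieved separately.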

Status: New

Type-Defect Priority-Medium