Friday, January 25, 2013

Using a shared function from different blocks II

I've decided to share the function in the simple way, which presumes that block B knows about the parameter space P. The easiest implementation is just to share the value p. In this scheme block A uses a known parameter space P, and so does block B. Block B includes a class that encapsulates the calculation with a known algorithm, and that algorithm uses the parameter p received from block A.
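A minimal sketch of what the block B side could look like, assuming the known algorithm is the x mod p example from the earlier post and that p arrives as a plain int. The class name ModTransform is hypothetical, and how p is transported between the blocks is left out.

```csharp
using System;

// Block B side: wraps the predefined algorithm and applies it
// using the parameter p received from block A.
// ModTransform is a hypothetical name used only for illustration.
public sealed class ModTransform
{
    private readonly int _p;

    public ModTransform(int p)
    {
        // x % 0 would throw, so reject unusable parameters early.
        if (p <= 0) throw new ArgumentOutOfRangeException(nameof(p));
        _p = p;
    }

    // F: x -> y = x mod p, with p fixed at construction time.
    public int Apply(int x)
    {
        return x % _p;
    }
}
```

Since only an int crosses the boundary, nothing about delegates has to be serialized at all. That is the whole appeal of this scheme, and also why it is less flexible.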

This approach is much less interesting and not so "fuzzy", because it uses a predefined algorithm for creating the value p in block A and a predefined algorithm for using this value.

But serializing a function with dependencies on the outer scope is still an interesting problem to solve.

Wednesday, January 23, 2013

Using a shared function from different blocks I

At work I'm now facing an interesting problem: I have to provide a mechanism for generating functions dynamically and transferring them to another algorithm. The design of the solution isn't clear enough yet, so I don't know how I will create these functions, but here I'd like to talk about transferring the function.
Here is a rough scheme of the problem:

x = incoming data (space X)
y = transformed data (space Y)
[Block] A -----> creation of function F: x -> y
[Block] B -----> applying F: F(x) = y

The decision to break the solution into these two blocks comes from the fact that creating F can involve a large amount of data and CPU time. Block B, on the contrary, should work as fast as possible, using the current function F for the transformation. Block A and block B can even run on different servers. So function F has to be serialized somehow and transferred to block B.

Creating F involves choosing a value in some parameter space: F acts using a parameter p, which is computed from a stored set of data {x}. So there actually exist a function W: {x} -> p (p from space P) and F: x, p -> y. The question is whether block B should know about the parameter space or not. (In my opinion it shouldn't, because of the black box principle.)

To be concrete, here is an example of what I'm trying to do:

x = 1,2,3.....
y = 0,1

The algorithm stores some 10 values of x and then creates a parameter space which consists of only one value, p = 2 (W: {1,2,.....10} -> p=2). Then function F is created as F: x, p -> y = x mod p. If block B doesn't know about the parameter space, F is transformed into F': x -> y = x mod 2 and then transferred to block B. If it does know about the parameter space, one can leave F as it is and transfer it without transformation, together with the value p = 2.
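The two variants can be sketched in C# roughly as follows. The names ModExample, W and Specialize are hypothetical, and W is only a placeholder that always returns 2, since how p is really derived from the stored values isn't specified here.

```csharp
using System;
using System.Collections.Generic;

public static class ModExample
{
    // F: (x, p) -> y = x mod p
    public static readonly Func<int, int, int> F = (x, p) => x % p;

    // Hypothetical W: a placeholder that ignores its input and yields p = 2,
    // matching the example W: {1,2,...,10} -> p = 2.
    public static int W(IReadOnlyCollection<int> stored)
    {
        return 2;
    }

    // Variant where block B doesn't know P: p is fixed inside F'.
    public static Func<int, int> Specialize(Func<int, int, int> f, int p)
    {
        return x => f(x, p);
    }
}
```

In the first variant block A would send only the result of Specialize(F, p); in the second it would send F and p separately and block B would call F(x, p) itself. Note that the delegate returned by Specialize captures f and p, so it is a closure.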

The second question is how to implement this architecture in C# (the primary language of the solution). Concerning delegate serialization: one of the features of .NET is that a Func<...> delegate is serializable, but only if the Func doesn't use variables from the outer scope. For example:


        private void FuncTest()
        {
            // will serialize OK with binary formatter
            Func<object, object, int> func = (o1, o2) => o1.GetHashCode() | o2.GetHashCode();

            // fails to serialize
            int i = 10;
            Func<object, object, int> func2 = (o1, o2) => (o1.GetHashCode() | o2.GetHashCode()) + i;
        }


Assuming block B knows about the space P, one can consider the following solution: in block A you carefully create a Func<P,X,Y> or some wrapper around it, requiring that P be serializable. Then one can serialize this function together with some p from the space P, and in block B create a Func<X,Y> that always uses the deserialized value p and the deserialized Func<P,X,Y>.
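A possible shape for such a wrapper, as a sketch only. BoundFunction and Bind are hypothetical names, and the actual serialization call is left out on purpose: the binary formatter mentioned above is marked obsolete in recent .NET versions, so the transport is assumed to be chosen separately.

```csharp
using System;

// Hypothetical payload sent from block A to block B:
// the two-argument function together with the parameter p.
[Serializable]
public sealed class BoundFunction<P, X, Y>
{
    public Func<P, X, Y> Function { get; }
    public P Parameter { get; }

    public BoundFunction(Func<P, X, Y> function, P parameter)
    {
        Function = function ?? throw new ArgumentNullException(nameof(function));
        Parameter = parameter;
    }

    // Block B would call this after deserialization to get
    // the Func<X, Y> that it actually applies to incoming data.
    public Func<X, Y> Bind()
    {
        return x => Function(Parameter, x);
    }
}
```

The closure is created by Bind only on the block B side, after the transfer, so the function that travels between the blocks doesn't capture anything from the outer scope.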

This is what is needed here, but the situation gets more complicated if there are many parameter spaces P. That's the reason (besides some considerations of beauty and the black box principle) I don't want block B to know about the inner spaces of block A.

If one assumes that P is hidden in block A, the situation becomes more sophisticated. The way to move on is, in my opinion, Expressions and ExpressionVisitors from the System.Linq.Expressions namespace. Actually, I've never used them and didn't even know about them until now. But (Linq impresses!) they seem to be powerful and flexible instruments. What I'm currently thinking of is to build an expression (note that a lambda assigned to an Expression is not the same thing as one assigned to a delegate, so some conversion from Func to Expression might be needed; I was rather surprised by this) and then, when serializing, to replace every subexpression that doesn't depend on the actual parameters of the Func<X,Y> with its current value.

The task that initially seemed easy is not finished yet.
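Here is a minimal sketch of that idea, under the assumption that captured variables show up in the tree as field accesses on a constant closure object (which is how the C# compiler represents them). ClosureInliner and InlineDemo are hypothetical names, and serializing the resulting tree is not covered.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Replaces reads of captured variables with their current values.
public sealed class ClosureInliner : ExpressionVisitor
{
    protected override Expression VisitMember(MemberExpression node)
    {
        // A captured local appears as a field access on a constant closure object.
        if (node.Expression is ConstantExpression closure && node.Member is FieldInfo field)
        {
            object value = field.GetValue(closure.Value);
            return Expression.Constant(value, node.Type);
        }
        return base.VisitMember(node);
    }
}

public static class InlineDemo
{
    // Builds F': x -> x mod p with the current value of p written into the tree.
    public static Expression<Func<int, int>> BuildFPrime(int p)
    {
        // Declared as an Expression from the start, so the compiler
        // emits a tree instead of a compiled delegate.
        Expression<Func<int, int>> f = x => x % p;
        return (Expression<Func<int, int>>)new ClosureInliner().Visit(f);
    }
}
```

After the visitor runs, the tree no longer refers to anything in block A's scope, and calling Compile() on it gives an ordinary Func<int, int>. The sketch handles only directly captured fields; nested member access or method calls on captured objects would need more work.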

Friday, January 11, 2013

A not-so-obvious error


Here I describe an error I made that might not seem obvious at first sight. (The code in this post is C#.)

The following situation occurred at my work recently. A class was declared approximately as follows:

    public class SomeClass
    {
        private volatile int _x1; 
        private volatile int _x2;


        public int X1
        {
            get { return _x1; }
            set
            {
                System.Threading.Interlocked.Exchange(ref _x1, value);
                CheckForDone();
            }
        }
        public int X2
        {
            get { return _x2; }
            set
            {
                System.Threading.Interlocked.Exchange(ref _x2, value);
                CheckForDone();
            }
        }
        public int X { get; set; }


        public bool IsCompleted { get { return X1 == X && X2 == X; } }


        private void CheckForDone()
        {
            if (IsCompleted)
                RaiseDone(new EventArgs());
        }


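        public event EventHandler Done;
        private void RaiseDone(EventArgs args)
        {
            // Copy to a local so a concurrent unsubscribe cannot
            // null the delegate between the check and the call.
            EventHandler handler = Done;
            if (handler != null)
                handler(this, args);
        }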
    }

Here is the situation: variables _x1 and _x2 are incremented in some unknown order (from different threads) until both of them reach the value X. Each time a variable gets a new value, CheckForDone checks whether both values have reached X, and if they have, an event is fired. Thus the last variable to reach X fires the event.

The problem is that the Done event sometimes fires twice, which is undesirable. This often happened on the first firing of the Done event, due to JIT compilation delays, I think. The reason is, of course, that IsCompleted and the setters of X1 and X2 are not atomic operations taken together. Suppose the first thread has set X1 and is evaluating X1 == X (which is, say, true); meanwhile X2 changes in another thread (so X2 == X becomes true), which triggers another CheckForDone call. IsCompleted then returns true in both threads, and the event fires twice.

There are several ways to solve the problem and I chose the simplest one: I added a variable which indicates whether the Done event has already fired.

        private volatile int _doneFired = 0;
        public bool DoneFired
        {
            get { return _doneFired == 1; }
            set { System.Threading.Interlocked.Exchange(ref _doneFired, value ? 1 : 0); }
        }

Then CheckForDone is modified in the following way:

        private void CheckForDone()
        {
            if (IsCompleted && !DoneFired)
            {
                // problem zone
                DoneFired = true;
                RaiseDone(new EventArgs());
            }
        }

The fix seemed obvious enough, but it is still wrong and, surprisingly, the problem occurred again.

The error is that both threads can enter the "problem zone" before either of them has set DoneFired to true. The solution is to use CompareExchange instead of Exchange: it guarantees that comparing the flag and setting it are performed as a single atomic operation. So the code for DoneFired has to be written in this form:

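        // Note: this getter has a side effect. It atomically sets the flag to 1 if it was 0
        // and returns true only if the flag was already set before the call.
        public bool DoneFired 
        { 
            get { return System.Threading.Interlocked.CompareExchange(ref _doneFired, 1, 0) == 1; } 
        }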

        private void CheckForDone()
        {
            if (IsCompleted && !DoneFired)
            {
                RaiseDone(new EventArgs());
            }
        }
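Because the getter above performs a write, a variant that makes this explicit may be easier to read. The following is a sketch of the same CompareExchange technique with the test-and-set moved into a method. OneShotEvent and TryMarkDone are hypothetical names, and the completion check is passed in as an argument to keep the example short.

```csharp
using System;
using System.Threading;

public sealed class OneShotEvent
{
    private int _doneFired; // 0 = not fired yet, 1 = fired

    public event EventHandler Done;

    // Returns true for exactly one caller: the one that flips the flag from 0 to 1.
    private bool TryMarkDone()
    {
        return Interlocked.CompareExchange(ref _doneFired, 1, 0) == 0;
    }

    // Hypothetical stand-in for CheckForDone in the class above.
    public void CheckForDone(bool isCompleted)
    {
        if (isCompleted && TryMarkDone())
        {
            EventHandler handler = Done;
            if (handler != null)
                handler(this, EventArgs.Empty);
        }
    }
}
```

However many threads call CheckForDone(true) concurrently, only the thread whose CompareExchange observes 0 raises the event.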

Sure, the problem can be solved in another, more obvious way: by taking a lock on some object, which guarantees that the check of DoneFired and its change happen together and only once. But a lock is significantly slower than an interlocked operation, and this can be crucial in some scenarios.
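For comparison, the lock-based variant might look like this. It is a sketch: LockedOneShotEvent and _sync are hypothetical names, and the completion check is again passed in as an argument.

```csharp
using System;

public sealed class LockedOneShotEvent
{
    private readonly object _sync = new object();
    private bool _doneFired;

    public event EventHandler Done;

    public void CheckForDone(bool isCompleted)
    {
        bool fireNow = false;
        lock (_sync)
        {
            // The check and the update happen under the same lock,
            // so only one caller can ever see _doneFired == false.
            if (isCompleted && !_doneFired)
            {
                _doneFired = true;
                fireNow = true;
            }
        }

        // Raise outside the lock so slow handlers don't block other callers.
        if (fireNow)
        {
            EventHandler handler = Done;
            if (handler != null)
                handler(this, EventArgs.Empty);
        }
    }
}
```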