Microsoft Technologies: 2007

Saturday, September 01, 2007

Design Patterns

Factory Pattern

Factory Methods are usually called within Template Methods. Often, designs start out using Factory Method (less complicated, more customizable, subclasses proliferate) and evolve toward Abstract Factory, Prototype, or Builder (more flexible, more complex) as the designer discovers where more flexibility is needed. Wherever you can implement Factory Method, you can usually incorporate Abstract Factory or Builder as well.
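To make the first sentence concrete, here is a minimal C++ sketch of a Factory Method being called from inside a Template Method. The names (Document, Application, openNew, createDocument) are made up for illustration and do not come from any particular library.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical product hierarchy, used only for illustration.
struct Document {
    virtual ~Document() = default;
    virtual std::string kind() const = 0;
};

struct TextDocument : Document {
    std::string kind() const override { return "text"; }
};

struct Application {
    virtual ~Application() = default;

    // Template Method: a fixed algorithm that calls the factory method.
    std::string openNew() const {
        std::unique_ptr<Document> doc = createDocument();
        return "opened " + doc->kind();
    }

protected:
    // Factory Method: subclasses decide which concrete Document to create.
    virtual std::unique_ptr<Document> createDocument() const = 0;
};

struct TextApplication : Application {
protected:
    std::unique_ptr<Document> createDocument() const override {
        return std::make_unique<TextDocument>();
    }
};

// Usage: the caller only sees Application::openNew().
inline void demoFactoryMethod() {
    TextApplication app;
    assert(app.openNew() == "opened text");
}
```

Supporting another document type means adding another Application subclass, which is the "subclasses proliferate" cost mentioned above.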

Abstract Factory

Abstract Factory classes are often implemented with Factory Methods, but they can also be implemented using Prototype. Wherever you can implement Abstract Factory, you can usually incorporate Factory Method and Builder as well. The core difference between the two is that Abstract Factory is useful when the application needs to create a family of related objects, whereas Factory Method is useful for creating a single instance of a class.
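The "family of objects" idea can be sketched roughly as follows. This is only an illustration in C++; the widget names and the WidgetFactory interface are assumptions invented for the example.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Two hypothetical product types that must match each other.
struct Button {
    virtual ~Button() = default;
    virtual std::string name() const = 0;
};
struct Checkbox {
    virtual ~Checkbox() = default;
    virtual std::string name() const = 0;
};

struct LightButton : Button { std::string name() const override { return "light button"; } };
struct LightCheckbox : Checkbox { std::string name() const override { return "light checkbox"; } };
struct DarkButton : Button { std::string name() const override { return "dark button"; } };
struct DarkCheckbox : Checkbox { std::string name() const override { return "dark checkbox"; } };

// Abstract Factory: one factory method per member of the family.
struct WidgetFactory {
    virtual ~WidgetFactory() = default;
    virtual std::unique_ptr<Button> createButton() const = 0;
    virtual std::unique_ptr<Checkbox> createCheckbox() const = 0;
};

struct LightFactory : WidgetFactory {
    std::unique_ptr<Button> createButton() const override { return std::make_unique<LightButton>(); }
    std::unique_ptr<Checkbox> createCheckbox() const override { return std::make_unique<LightCheckbox>(); }
};

struct DarkFactory : WidgetFactory {
    std::unique_ptr<Button> createButton() const override { return std::make_unique<DarkButton>(); }
    std::unique_ptr<Checkbox> createCheckbox() const override { return std::make_unique<DarkCheckbox>(); }
};

// Client code never names a concrete widget, so the family stays consistent.
inline std::string describeUi(const WidgetFactory& factory) {
    return factory.createButton()->name() + " + " + factory.createCheckbox()->name();
}

inline void demoAbstractFactory() {
    assert(describeUi(DarkFactory{}) == "dark button + dark checkbox");
}
```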

Builder

Builder focuses on constructing a complex object step by step. Abstract Factory emphasizes a family of product objects (either simple or complex). Builder returns the product as a final step, whereas with Abstract Factory the product is returned immediately. Wherever you can implement Builder, you can usually implement Factory Method or Abstract Factory as well.
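A small C++ sketch of the "step by step, product returned last" idea follows. Report and ReportBuilder are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical product assembled in several steps.
struct Report {
    std::string title;
    std::vector<std::string> sections;
};

class ReportBuilder {
public:
    // Each step returns the builder so that calls can be chained.
    ReportBuilder& title(std::string text) {
        report_.title = std::move(text);
        return *this;
    }
    ReportBuilder& section(std::string text) {
        report_.sections.push_back(std::move(text));
        return *this;
    }
    // The product is handed over only in the final step.
    Report build() { return std::move(report_); }

private:
    Report report_;
};

inline void demoBuilder() {
    Report r = ReportBuilder().title("Q3").section("Intro").section("Results").build();
    assert(r.title == "Q3");
    assert(r.sections.size() == 2);
}
```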

Prototype

Factory Method: creation through inheritance. Prototype: creation through delegation. In .NET, Prototype is easy to implement through the ICloneable interface and its Clone() method. Prototype is useful when you want to create a new object without running the object's constructor. In the .NET Framework there are two ways to create an object without calling its constructor: cloning and deserialization.
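For comparison, here is a rough C++ analogue of "creation through delegation". Note the difference from the .NET case described above: in C++ the clone goes through the copy constructor, so a constructor still runs, just not the ordinary one. Shape, Circle and makeFrom are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Shape {
    virtual ~Shape() = default;
    // Prototype: every object knows how to produce a copy of itself.
    virtual std::unique_ptr<Shape> clone() const = 0;
    virtual std::string describe() const = 0;
};

struct Circle : Shape {
    explicit Circle(int r) : radius(r) {}
    std::unique_ptr<Shape> clone() const override {
        return std::make_unique<Circle>(*this);  // copy constructor, not Circle(int)
    }
    std::string describe() const override {
        return "circle r=" + std::to_string(radius);
    }
    int radius;
};

// The client delegates creation to an existing instance
// and never names the concrete class.
inline std::unique_ptr<Shape> makeFrom(const Shape& prototype) {
    return prototype.clone();
}

inline void demoPrototype() {
    Circle proto(3);
    std::unique_ptr<Shape> copy = makeFrom(proto);
    assert(copy->describe() == "circle r=3");
    assert(copy.get() != &proto);  // a distinct object
}
```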

Adapter

Adapter makes things work after they're designed; Bridge makes them work before they are. Bridge is designed up-front to let the abstraction and the implementation vary independently. Adapter is retrofitted to make unrelated classes work together.
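The "retrofitted" nature of Adapter can be sketched like this in C++. FahrenheitSensor stands in for an existing class that cannot be changed, and CelsiusSource for the interface the rest of the code expects; both are invented for the example.

```cpp
#include <cassert>

// Existing class with an interface we cannot change (hypothetical).
class FahrenheitSensor {
public:
    double readFahrenheit() const { return 212.0; }
};

// Interface the rest of the application was written against (hypothetical).
struct CelsiusSource {
    virtual ~CelsiusSource() = default;
    virtual double readCelsius() const = 0;
};

// Adapter: added after the fact to make the two work together.
class SensorAdapter : public CelsiusSource {
public:
    explicit SensorAdapter(const FahrenheitSensor& sensor) : sensor_(sensor) {}
    double readCelsius() const override {
        return (sensor_.readFahrenheit() - 32.0) * 5.0 / 9.0;
    }

private:
    const FahrenheitSensor& sensor_;
};

inline void demoAdapter() {
    FahrenheitSensor sensor;
    SensorAdapter adapter(sensor);
    const CelsiusSource& source = adapter;
    double c = source.readCelsius();
    assert(c > 99.999 && c < 100.001);  // 212 F is 100 C
}
```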

Bridge

State and Bridge have identical structures (except that Bridge admits hierarchies of envelope classes, whereas State allows only one). The two patterns use the same structure to solve different problems: State allows an object's behavior to change along with its state, while Bridge's intent is to decouple an abstraction from its implementation so that the two can vary independently.

Composite

Three GoF patterns rely on recursive composition: Composite, Decorator, and Chain of Responsibility.

Decorator

Decorator is designed to let you add responsibilities to objects without subclassing. Composite's focus is not on embellishment but on representation. These intents are distinct but complementary. Consequently, Composite and Decorator are often used in concert.

Façade

Facade defines a new interface, whereas Adapter uses an old interface. Remember that Adapter makes two existing interfaces work together as opposed to defining an entirely new one.

Flyweight

Flyweight is often combined with Composite to implement shared leaf nodes. Flyweight shows how to make lots of little objects. Facade shows how to make a single object represent an entire subsystem. This diagram is perhaps a better example of Composite than the Composite diagram.

Proxy

Decorator and Proxy have different purposes but similar structures. Both describe how to provide a level of indirection to another object, and the implementations keep a reference to the object to which they forward requests. Adapter provides a different interface to its subject. Proxy provides the same interface. Decorator provides an enhanced interface.
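The structural similarity is easy to see in code. In the C++ sketch below, both wrappers implement the same interface and keep a reference to the object they forward to; only the purpose differs. All names (Text, PlainText, BoldDecorator, CountingProxy) are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Text {
    virtual ~Text() = default;
    virtual std::string render() const = 0;
};

class PlainText : public Text {
public:
    explicit PlainText(std::string s) : s_(std::move(s)) {}
    std::string render() const override { return s_; }

private:
    std::string s_;
};

// Decorator: forwards the request and adds a responsibility (bold markup).
class BoldDecorator : public Text {
public:
    explicit BoldDecorator(const Text& inner) : inner_(inner) {}
    std::string render() const override { return "<b>" + inner_.render() + "</b>"; }

private:
    const Text& inner_;
};

// Proxy: forwards the request unchanged but controls or tracks access.
class CountingProxy : public Text {
public:
    explicit CountingProxy(const Text& subject) : subject_(subject) {}
    std::string render() const override {
        ++calls_;
        return subject_.render();
    }
    int calls() const { return calls_; }

private:
    const Text& subject_;
    mutable int calls_ = 0;
};

inline void demoDecoratorAndProxy() {
    PlainText plain("hi");
    BoldDecorator bold(plain);
    assert(bold.render() == "<b>hi</b>");

    CountingProxy proxy(plain);
    assert(proxy.render() == "hi");  // same result as the real subject
    assert(proxy.render() == "hi");
    assert(proxy.calls() == 2);
}
```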

Chain of Responsibility

Chain of Responsibility, Command, Mediator, and Observer address how you can decouple senders and receivers, but with different trade-offs. Chain of Responsibility passes a sender request along a chain of potential receivers.
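Passing a request along a chain of potential receivers might look like the following C++ sketch. The approval-limit scenario and the Handler and LimitHandler names are assumptions made up for the example.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Handler {
public:
    virtual ~Handler() = default;
    void setNext(const Handler* next) { next_ = next; }

    // Try this receiver first, otherwise pass the request down the chain.
    std::string handle(int amount) const {
        if (canHandle(amount)) return name();
        if (next_ != nullptr) return next_->handle(amount);
        return "unhandled";
    }

protected:
    virtual bool canHandle(int amount) const = 0;
    virtual std::string name() const = 0;

private:
    const Handler* next_ = nullptr;
};

// Hypothetical receiver that approves requests up to a limit.
class LimitHandler : public Handler {
public:
    LimitHandler(std::string name, int limit) : name_(std::move(name)), limit_(limit) {}

protected:
    bool canHandle(int amount) const override { return amount <= limit_; }
    std::string name() const override { return name_; }

private:
    std::string name_;
    int limit_;
};

inline void demoChain() {
    LimitHandler clerk("clerk", 100);
    LimitHandler manager("manager", 1000);
    clerk.setNext(&manager);

    // The sender only ever talks to the head of the chain.
    assert(clerk.handle(50) == "clerk");
    assert(clerk.handle(500) == "manager");
    assert(clerk.handle(5000) == "unhandled");
}
```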

Command

Command and Memento act as magic tokens to be passed around and invoked at a later time. In Command, the token represents a request; in Memento, it represents the internal state of an object at a particular time. Polymorphism is important to Command, but not to Memento because its interface is so narrow that a memento can only be passed as a value.

Interpreter

Interpreter is really an application of Composite.

Iterator

Memento is often used in conjunction with Iterator. An Iterator can use a Memento to capture the state of an iteration. The Iterator stores the Memento internally.

Mediator

Mediator is similar to Facade in that it abstracts functionality of existing classes. Mediator abstracts/centralizes arbitrary communications between colleague objects. It routinely "adds value", and it is known/referenced by the colleague objects. In contrast, Facade defines a simpler interface to a subsystem, it doesn't add new functionality, and it is not known by the subsystem classes.

Memento

Command can use Memento to maintain the state required for an undo operation.
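As a rough illustration of that sentence, the C++ sketch below has a command take a memento of the receiver before it executes and restore it on undo. Editor, Memento and AppendCommand are hypothetical names; the memento's interface is deliberately narrow, so only Editor can look inside it.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Editor {
public:
    // Memento: opaque snapshot that can only be passed around as a value.
    class Memento {
        friend class Editor;
        explicit Memento(std::string state) : state_(std::move(state)) {}
        std::string state_;
    };

    void append(const std::string& s) { text_ += s; }
    Memento save() const { return Memento(text_); }
    void restore(const Memento& m) { text_ = m.state_; }
    const std::string& text() const { return text_; }

private:
    std::string text_;
};

// Command: a request object that remembers how to undo itself.
class AppendCommand {
public:
    AppendCommand(Editor& editor, std::string text)
        : editor_(editor), text_(std::move(text)), before_(editor.save()) {}

    void execute() {
        before_ = editor_.save();  // capture state right before the change
        editor_.append(text_);
    }
    void undo() { editor_.restore(before_); }

private:
    Editor& editor_;
    std::string text_;
    Editor::Memento before_;
};

inline void demoCommandWithMemento() {
    Editor editor;
    editor.append("Hello");
    AppendCommand cmd(editor, " world");
    cmd.execute();
    assert(editor.text() == "Hello world");
    cmd.undo();
    assert(editor.text() == "Hello");
}
```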

Observer

Mediator and Observer are competing patterns. The difference between them is that Observer distributes communication by introducing "observer" and "subject" objects, whereas a Mediator object encapsulates the communication between other objects. We've found it easier to make reusable Observers and Subjects than to make reusable Mediators. On the other hand, Mediator can leverage Observer for dynamically registering colleagues and communicating with them.
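Here is a minimal C++ sketch of the Observer side of that comparison, where the subject knows nothing about its observers beyond a callback signature. The Subject class and its attach and setValue members are invented for the example.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

class Subject {
public:
    using Observer = std::function<void(int)>;

    // Observers register themselves dynamically.
    void attach(Observer observer) { observers_.push_back(std::move(observer)); }

    // Every state change is broadcast to all registered observers.
    void setValue(int value) {
        value_ = value;
        for (const Observer& notify : observers_) notify(value_);
    }

private:
    int value_ = 0;
    std::vector<Observer> observers_;
};

inline void demoObserver() {
    Subject subject;
    int lastSeen = 0;
    int notifications = 0;
    subject.attach([&lastSeen](int v) { lastSeen = v; });
    subject.attach([&notifications](int) { ++notifications; });

    subject.setValue(7);
    assert(lastSeen == 7);
    assert(notifications == 1);
}
```

Because the subject depends only on the callback type, it is easy to reuse, which fits the remark above about reusable Observers and Subjects.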

Strategy

Strategy, State, Bridge (and to some degree Adapter) have similar solution structures. They all share elements of the "handle/body" idiom. They differ in intent - that is, they solve different problems.

Most of the GoF patterns exercise the following two levels of indirection:

1. Promote the "interface" of a method to an abstract base class or interface, and bury the many possible implementation choices in concrete derived classes.
2. Hide the implementation hierarchy behind a "wrapper" class that can perform responsibilities like choosing the best implementation, caching, state management, and remote access.

Strategy can be considered the mother of all patterns.

State

Strategy is a bind-once pattern, whereas State is more dynamic.
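The "more dynamic" part can be sketched as follows: in this C++ example the context swaps its state object on every call, whereas a Strategy would typically be chosen once and left alone. Door, DoorState, OpenState and ClosedState are hypothetical names.

```cpp
#include <cassert>
#include <string>

struct DoorState {
    virtual ~DoorState() = default;
    virtual const DoorState& next() const = 0;  // the state picks its successor
    virtual std::string name() const = 0;
};

struct OpenState : DoorState {
    const DoorState& next() const override;
    std::string name() const override { return "open"; }
};

struct ClosedState : DoorState {
    const DoorState& next() const override;
    std::string name() const override { return "closed"; }
};

// Shared, stateless state objects.
const OpenState kOpen{};
const ClosedState kClosed{};

inline const DoorState& OpenState::next() const { return kClosed; }
inline const DoorState& ClosedState::next() const { return kOpen; }

// Context: its behavior changes because the state object is rebound at run time.
class Door {
public:
    void toggle() { state_ = &state_->next(); }
    std::string status() const { return state_->name(); }

private:
    const DoorState* state_ = &kClosed;
};

inline void demoState() {
    Door door;
    assert(door.status() == "closed");
    door.toggle();
    assert(door.status() == "open");
    door.toggle();
    assert(door.status() == "closed");
}
```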

Template Method

Template Method uses inheritance to vary part of an algorithm. Strategy uses delegation to vary the entire algorithm.
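The two approaches can be put side by side in a short C++ sketch. In the first half a subclass overrides one step of a fixed algorithm; in the second half the whole algorithm is handed in as an object. Report, DraftReport and Renderer are names invented for the example.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Template Method: the skeleton is fixed, inheritance varies one step.
class Report {
public:
    virtual ~Report() = default;
    std::string render(const std::string& body) const { return header() + body; }

protected:
    virtual std::string header() const = 0;  // the part that varies
};

class DraftReport : public Report {
protected:
    std::string header() const override { return "DRAFT: "; }
};

// Strategy: the entire algorithm is supplied by delegation.
class Renderer {
public:
    using Algorithm = std::function<std::string(const std::string&)>;
    explicit Renderer(Algorithm algorithm) : algorithm_(std::move(algorithm)) {}
    std::string render(const std::string& body) const { return algorithm_(body); }

private:
    Algorithm algorithm_;
};

inline void demoTemplateVsStrategy() {
    DraftReport draft;
    assert(draft.render("x") == "DRAFT: x");

    Renderer angle([](const std::string& body) { return "<" + body + ">"; });
    assert(angle.render("x") == "<x>");
}
```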

Visitor

The Visitor pattern is the classic technique for recovering lost type information without resorting to dynamic casts.
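A small C++ sketch of what "recovering lost type information" means in practice: the caller holds only a base-class reference, yet the right overload runs without any dynamic_cast. The shape classes and NameVisitor are hypothetical.

```cpp
#include <cassert>
#include <string>

struct Circle;
struct Square;

// One visit() overload per concrete type.
struct ShapeVisitor {
    virtual ~ShapeVisitor() = default;
    virtual void visit(const Circle&) = 0;
    virtual void visit(const Square&) = 0;
};

struct Shape {
    virtual ~Shape() = default;
    virtual void accept(ShapeVisitor& visitor) const = 0;
};

// Inside accept(), *this has its concrete static type again,
// so overload resolution picks the matching visit().
struct Circle : Shape {
    void accept(ShapeVisitor& visitor) const override { visitor.visit(*this); }
};
struct Square : Shape {
    void accept(ShapeVisitor& visitor) const override { visitor.visit(*this); }
};

struct NameVisitor : ShapeVisitor {
    std::string name;
    void visit(const Circle&) override { name = "circle"; }
    void visit(const Square&) override { name = "square"; }
};

inline std::string nameOf(const Shape& shape) {
    NameVisitor visitor;
    shape.accept(visitor);  // double dispatch, no casts
    return visitor.name;
}

inline void demoVisitor() {
    Square square;
    const Shape& shape = square;  // static type information is "lost" here
    assert(nameOf(shape) == "square");
}
```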

Friday, May 18, 2007

How to avoid GC-Hole

To avoid a GC hole, you must use GCPROTECT_BEGIN to keep your references up to date.

Here’s how to fix our buggy code fragment.

#include "frames.h"
{
    MethodTable *pNT = g_pObjectClass->GetRefTable();

    // RIGHT
    OBJECTREF a = AllocateObjectMemory(pNT);

    // Protect "a" before the next allocation, which may trigger a GC.
    GCPROTECT_BEGIN(a);
    OBJECTREF b = AllocateObjectMemory(pNT);

    DoSomething(a, b);

    GCPROTECT_END();
}

Notice the addition of the line GCPROTECT_BEGIN(a). GCPROTECT_BEGIN is a macro whose argument is any OBJECTREF-typed storage location (it has to be an expression that you can legally apply the address-of (&) operator to). GCPROTECT_BEGIN tells the GC two things:

1. The GC is not to discard any object referred to by the reference stored in local “a”.
2. If the GC moves the object referred to by “a”, it must update “a” to point to the new location.

Now, if the second AllocateObjectMemory() triggers a GC, the “a” object will still be around afterwards, and the local variable “a” will still point to it. “a” may not contain the same address as before, but it will point to the same object. Hence, DoSomething() receives the correct data.

Note that we didn’t similarly protect “b” because the caller has no use for “b” after DoSomething() returns. Furthermore, there’s no point in keeping “b” updated because DoSomething() receives a copy of the reference (not to be confused with a copy of the object), not the reference itself. If DoSomething() internally causes a GC as well, it is DoSomething()’s responsibility to protect its own copies of “a” and “b”.

Having said that, no one should complain if you play it safe and GCPROTECT “b” as well. You never know when someone might add code later that makes the protection necessary.

Every GCPROTECT_BEGIN must have a matching GCPROTECT_END, which terminates “a”’s protected status. As an additional safeguard, GCPROTECT_END overwrites “a” with garbage so that any attempt to use “a” afterward will fault. GCPROTECT_BEGIN introduces a new C scoping level that GCPROTECT_END closes, so if you use one without the other, you’ll probably experience severe build errors.

Don’t do nonlocal returns from within GCPROTECT blocks.

Never do a “return”, “goto” or other non-local return from between a GCPROTECT_BEGIN/END pair. This will leave the thread’s frame chain corrupted.

One exception: it is explicitly allowed to leave a GCPROTECT block by throwing a managed exception (usually via the COMPlusThrow() function.) The exception subsystem knows about GCPROTECT and correctly fixes up the frame chain as it unwinds.
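To see why the macros must be paired and why an early return is harmful, here is a toy model in C++. It is not the CLR's implementation: TOY_PROTECT_BEGIN, TOY_PROTECT_END and g_toyFrameChain are hypothetical stand-ins that only mimic the two properties described above (BEGIN opens a scope that END closes, and END is what pops the frame).

```cpp
#include <cassert>
#include <vector>

using ToyRef = int*;

// Toy stand-in for the thread's frame chain: addresses of protected locals.
static std::vector<ToyRef*> g_toyFrameChain;

// BEGIN opens a brace and pushes a frame; END pops it and closes the brace.
// Using one without the other leaves the braces unbalanced, so the build breaks.
#define TOY_PROTECT_BEGIN(ref) { g_toyFrameChain.push_back(&(ref));
#define TOY_PROTECT_END()        g_toyFrameChain.pop_back(); }

// Correct: control falls through to TOY_PROTECT_END, so the chain is balanced.
inline int balanced(ToyRef a) {
    int result = 0;
    TOY_PROTECT_BEGIN(a);
    result = *a;
    TOY_PROTECT_END();
    return result;
}

// WRONG: the early return skips TOY_PROTECT_END and leaves a stale entry
// (the address of a dead local) on the chain.
inline int leaky(ToyRef a) {
    TOY_PROTECT_BEGIN(a);
    if (*a > 0) return *a;  // non-local exit from inside the block
    TOY_PROTECT_END();
    return 0;
}

inline void demoFrameChain() {
    int x = 5;
    assert(balanced(&x) == 5);
    assert(g_toyFrameChain.empty());      // frame was popped
    assert(leaky(&x) == 5);
    assert(g_toyFrameChain.size() == 1);  // frame was never popped
    g_toyFrameChain.clear();              // reset the toy chain
}
```

In the toy the damage is just a leftover vector entry; in the real runtime the equivalent leftover is what the text above calls a corrupted frame chain.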

Monday, April 09, 2007

What is GC-Hole? And How to create GC-Hole?

The term “GC hole” refers to a special class of bugs that bedevils the CLR. The GC hole is a pernicious bug because it is easy to introduce by accident, reproduces rarely, and is very tedious to debug. A single GC hole can suck up weeks of dev and test time.

Let’s discuss what a GC hole is and how one gets created.

First, some background. One of the major features of the CLR is the Garbage Collection system. That means that allocated objects, as seen by a managed application, are never freed explicitly by the programmer. Instead, the CLR periodically runs a Garbage Collector (GC). The GC discards objects that are no longer in use. Also, the GC compacts the heap to avoid unused holes in memory. Therefore, a managed object does not have a fixed address. Objects move around according to the whims of the garbage collector.

To do its job, the GC must be told about every reference to every GC object. The GC must know about every stack location, every register and every non-GC data structure that holds a pointer to a GC object. These external pointers are called “root references.”

Armed with this information, the GC can find all objects directly referenced from outside the GC heap. These objects may in turn, reference other objects – which in turn reference other objects and so on. By following these references, the GC finds all reachable (“live”) objects. All other objects are, by definition, unreachable and therefore discarded. After that, the GC may move the surviving objects to reduce memory fragmentation. If it does this, it must, of course, update all existing references to the moved object.

Any time a new object is allocated, a GC may occur. A GC can also be explicitly requested by calling the GarbageCollect function directly. GCs do not happen asynchronously outside these events, but since other running threads can trigger GCs, your thread must act as if GCs are asynchronous unless you take specific steps to synchronize with the GC. More on that later.

So now, we can define a GC hole. A GC hole occurs when code inside the CLR creates a reference to a GC object, neglects to tell the GC about that reference, performs some operation that directly or indirectly triggers a GC, then tries to use the original reference. At this point, the reference points to garbage memory and the CLR will either read out a wrong value or corrupt whatever that reference is pointing to.

The code snippet below is the simplest way to introduce a GC hole into the system.

// OBJECTREF is a typedef for Object*.

{
    PointerTable *pTBL = o_pObjectClass->GetPointerTable();

    OBJECTREF aObj = AllocateObjectMemory(pTBL);
    OBJECTREF bObj = AllocateObjectMemory(pTBL);

    // WRONG!!! “aObj” may point to garbage if the second
    // “AllocateObjectMemory” triggered a GC.
    DoSomething(aObj, bObj);
}

All it does is allocate two managed objects and then do something with both of them.

This code compiles fine, and if you run simple pre-checkin tests, it will probably “work.” But this code will crash eventually.

Why? If the second call to “AllocateObjectMemory” triggers a GC, that GC discards the object instance you just assigned to “aObj”. This code, like all C++ code inside the CLR, is compiled by a non-managed compiler and the GC cannot know that “aObj” holds a root reference to an object you want kept live.

This point is worth repeating. The GC has no intrinsic knowledge of root references stored in local variables or non-GC data structures maintained by the CLR itself. You must explicitly tell the GC about them.
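The effect can be reproduced with a toy compacting heap in C++. This is only a model under simplifying assumptions, not how the CLR works internally: a "reference" is just an index into a vector, collection is triggered by hand rather than by allocation, and ToyHeap, protect and collect are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Ref = std::size_t;  // toy "reference": an index into the heap

class ToyHeap {
public:
    Ref allocate(int value) {
        cells_.push_back(value);
        return cells_.size() - 1;
    }

    // Tell the collector where a root reference is stored.
    void protect(Ref* root) { roots_.push_back(root); }
    void unprotect() { roots_.pop_back(); }

    int read(Ref r) const { return cells_.at(r); }
    std::size_t size() const { return cells_.size(); }

    // Keep only objects referenced by registered roots, slide them to the
    // front of the heap, and update each root to the object's new position.
    void collect() {
        std::vector<int> survivors;
        for (Ref* root : roots_) {
            survivors.push_back(cells_.at(*root));
            *root = survivors.size() - 1;
        }
        cells_.swap(survivors);
    }

private:
    std::vector<int> cells_;
    std::vector<Ref*> roots_;
};

inline void demoGcHole() {
    ToyHeap heap;
    heap.allocate(0);           // unreferenced garbage at index 0
    Ref a = heap.allocate(42);  // lives at index 1
    Ref unprotectedCopy = a;    // the collector is never told about this one
    heap.protect(&a);

    heap.collect();             // compaction moves the 42 object to index 0

    assert(a == 0 && heap.read(a) == 42);    // registered root was updated
    assert(unprotectedCopy == 1);            // stale: still the old "address"
    assert(unprotectedCopy >= heap.size());  // which no longer holds an object
    heap.unprotect();
}
```

The registered root keeps working after the move, while the copy the collector never heard about is left pointing at nothing useful.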

Friday, March 30, 2007

Small but interesting thing about .NET Framework 2.0

We often talk about the .NET Framework 2.0: its pros and cons, its scope, patterns, best practices, and some of the basic class libraries. But do we know how many types the .NET Framework 2.0 contains, or other basic facts like that? I am sure most of us do not. So here is a little information I want to share with you.

Microsoft .NET Framework 2.0 includes 51 assemblies.
Microsoft .NET Framework 2.0 has 18,619 types.
Microsoft .NET Framework 2.0 has 12,909 classes.
Microsoft .NET Framework 2.0 has 401,758 public methods.
Microsoft .NET Framework 2.0 has 93,105 public properties.
Microsoft .NET Framework 2.0 has 30,546 public events.

How many of these events do we know or have we used?
How many of these types have we used in our careers as .NET developers?
How many of these methods have we used so far?
And the public events are another matter entirely.

One estimate says that a person working full-time on .NET 2.0 would need more than one man-year to use everything listed above. And .NET 4 is due for release next January or so [this is my assumption; nothing has been announced about .NET 4.0 yet]. So, guys, start exploring this version more and more.
