Thursday, December 22, 2005

String Vs StringBuilder : Kaushal Patel

Introduction

This is my second article on performance. Most people use string everywhere in their code. But when you concatenate strings, do you know exactly what you are doing? Concatenation with string has a significant cost, which StringBuilder can avoid. Using StringBuilder instead of string for repeated concatenation can improve performance considerably.

What is the exact difference?

First, let's look at what happens when you concatenate two strings. As a rough example, suppose you append a series of numbers in a loop to build one string that contains all of them.

    string returnNumber = "";
    for (int i = 0; i < 1000; i++)
    {
        returnNumber = returnNumber + i.ToString();
    }

Here we define a string called returnNumber and, inside the loop, concatenate the old value with the new number. Do you realize that each time we do this we are assigning a brand-new string? Strings are immutable, so each concatenation creates a new string that holds a copy of the old returnNumber followed by i.ToString(). This loop creates 1,000 new strings (not counting the temporary ones from i.ToString()), each a little longer than the last. How will such code perform? Hardly anyone thinks about this when coding. What if we had something that is defined only once and to which all the strings are added? That is what StringBuilder does.

    StringBuilder returnNumber = new StringBuilder(10000);
    for (int i = 0; i < 1000; i++)
    {
        returnNumber.Append(i.ToString());
    }

We create a StringBuilder with an initial capacity of 10,000 characters, into which all the strings are added. This does not create a new string on every pass; whatever is appended is copied into the builder's buffer. At the end we get the result with StringBuilder.ToString(). In the .NET versions current at the time of writing, ToString() generally does not copy the characters either: it returns a string instance that points to the buffer inside the StringBuilder. (Later versions of .NET do copy the characters once in ToString(), which is still far cheaper than copying on every concatenation.) See how efficient this is? I am not going to analyze the IL or the optimized JIT-compiled code to demonstrate it; you can see the difference by running the samples.
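
To try the two samples side by side, a small timing harness along these lines could be used. This is only a sketch: the class name ConcatBenchmark, the method names, and the count of 20,000 are made up for illustration, and the timings will vary with the machine and the runtime version.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class ConcatBenchmark
{
    // Builds "0123...(count-1)" with the + operator; every pass allocates a new string.
    public static string WithString(int count)
    {
        string result = "";
        for (int i = 0; i < count; i++)
        {
            result = result + i.ToString();
        }
        return result;
    }

    // Builds the same text by appending into a single StringBuilder buffer.
    public static string WithBuilder(int count)
    {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            builder.Append(i);
        }
        return builder.ToString();
    }

    public static void Main()
    {
        const int count = 20000; // hypothetical size, large enough to show a gap

        Stopwatch watch = Stopwatch.StartNew();
        string a = WithString(count);
        watch.Stop();
        Console.WriteLine("string +      : " + watch.ElapsedMilliseconds + " ms");

        watch = Stopwatch.StartNew();
        string b = WithBuilder(count);
        watch.Stop();
        Console.WriteLine("StringBuilder : " + watch.ElapsedMilliseconds + " ms");

        Console.WriteLine("Same result   : " + (a == b));
    }
}
```

Both methods return the same text, so the only thing that differs between the two runs is how the text is assembled.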
Why string? Can't we use StringBuilder everywhere?
No, you can't. Initializing a StringBuilder has a cost of its own, and many operations available on string are not available on StringBuilder. It is mostly meant for situations like the one explained above. Last week I saw someone use a StringBuilder just to join two strings together, which makes no sense. We must really think about the initialization overhead. In my personal experience, StringBuilder pays off where four or more concatenations take place. Also, if most of the work is other manipulation (removing part of the string, replacing part of it, and so on) that you do through string methods, StringBuilder gains you little there, because those methods create new strings anyway. Another important point: we must take care when estimating the size of the StringBuilder. If the final text is longer than the capacity assigned, the builder must grow its buffer, which reduces performance.
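
The effect of the initial size can be observed through the Capacity property. The sketch below counts how often Capacity changes while the same 1,000 numbers are appended; the CapacityDemo class and the CountGrowths helper are hypothetical names, and the exact number of growths for the default-sized builder depends on the runtime version, so none is quoted here.

```csharp
using System;
using System.Text;

public static class CapacityDemo
{
    // Appends 0..count-1 and reports how many times the builder's Capacity changed.
    public static int CountGrowths(StringBuilder builder, int count)
    {
        int growths = 0;
        int lastCapacity = builder.Capacity;
        for (int i = 0; i < count; i++)
        {
            builder.Append(i);
            if (builder.Capacity != lastCapacity)
            {
                growths++;
                lastCapacity = builder.Capacity;
            }
        }
        return growths;
    }

    public static void Main()
    {
        // The appended text is 2,890 characters long, so 10,000 is more than enough.
        Console.WriteLine("Default capacity : " + CountGrowths(new StringBuilder(), 1000) + " growths");
        Console.WriteLine("Capacity 10000   : " + CountGrowths(new StringBuilder(10000), 1000) + " growths");
    }
}
```

A builder that starts large enough never has to grow, while the default-sized one grows several times along the way.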

Monday, November 07, 2005

My Exams Score Board @ BrainBench



1. .NET Framework

2. OO Design Patterns

Kaushal Patel
Test: .NET Framework
Date: 08-Nov-2005
Score: 3.11
Correct/Total: (25/40)

Topics/Subtopics                                          Correct/Total
.NET Programming                                          (5/7)
    Features/Benefits                                     (1/1)
    Attribute Programming                                 (2/3)
    Security                                              (1/2)
    COM+ Services                                         (1/1)
Common Language Runtime (CLR)                             (3/8)
    Virtual Execution Engine (VEE)                        (1/1)
    Features/Benefits                                     (1/3)
    Compilers and Tools                                   (0/1)
    Runtime Hosts                                         (0/2)
    Automatic Memory Management (Garbage Collection)      (1/1)
Assemblies                                                (3/7)
    .NET Portable Executable (PE) File                    (1/3)
    Assembly Types                                        (0/1)
    Versioning                                            (1/2)
    Manifests/MSIL                                        (1/1)
.NET Architectural Overview                               (5/6)
    Features/Benefits                                     (2/2)
    Common Language Specification                         (1/1)
    Application Domains                                   (1/1)
    Deployment Strategies                                 (0/1)
    .NET Utilities/Tools                                  (1/1)
Common Type System (CTS)                                  (4/5)
    Reference Types                                       (2/2)
    Value Types                                           (1/1)
    Tables and Heaps                                      (1/1)
    Common Language Specification                         (0/1)
.NET Metadata                                             (1/3)
    Benefits/Features                                     (1/2)
    Inspecting and Emitting                               (0/1)
.NET Framework Class Library                              (4/4)
    Features/Benefits                                     (2/2)
    Data Classes                                          (2/2)

Kaushal Patel
Test: OO Design Patterns
Date: 08-Nov-2005
Score: 2.93
Correct/Total: (21/40)

Topics/Subtopics                                          Correct/Total
Language Specific Patterns                                (2/10)
    Java                                                  (2/8)
    C++                                                   (0/2)
GoF Patterns                                              (14/22)
    Behavioral                                            (4/8)
    Structural                                            (8/9)
    Creational                                            (2/5)
Persistence Patterns                                      (1/2)
    Serialization                                         (1/1)
    Object/Relational Mapping                             (0/1)
Fundamental Architectural Patterns                        (2/3)
    MVC                                                   (1/2)
    Layering                                              (1/1)
Distributed Patterns                                      (2/3)
    Design                                                (1/1)
    Architectural                                         (1/2)


Wednesday, October 19, 2005

Garbage Collection in .NET : Kaushal

Implementing proper resource management for your applications can be a difficult, tedious task. It can distract your concentration from the real problems that you're trying to solve. Wouldn't it be wonderful if some mechanism existed that simplified the mind-numbing task of memory management for the developer? Fortunately, in .NET there is: garbage collection (GC).
Let's back up a minute. Every program uses resources of one sort or another—memory buffers, screen space, network connections, database resources, and so on. In fact, in an object-oriented environment, every type identifies some resource available for your program's use. To use any of these resources requires that memory be allocated to represent the type. The steps required to access a resource are as follows:

1. Allocate memory for the type that represents the resource.

2. Initialize the memory to set the initial state of the resource and to make the resource usable.

3. Use the resource by accessing the instance members of the type (repeat as necessary).

4. Tear down the state of the resource to clean up.

5. Free the memory.

This seemingly simple paradigm has been one of the major sources of programming errors. After all, how many times have you forgotten to free memory when it is no longer needed or attempted to use memory after you've already freed it?
These two bugs are worse than most other application bugs because what the consequences will be and when those consequences will occur are typically unpredictable. For other bugs, when you see your application misbehaving, you just fix it. But these two bugs cause resource leaks (memory consumption) and object corruption (destabilization), making your application perform in unpredictable ways at unpredictable times. In fact, there are many tools (such as the Task Manager, the System Monitor ActiveX® Control, CompuWare's BoundsChecker, and Rational's Purify) that are specifically designed to help developers locate these types of bugs.
As I examine GC, you'll notice that it completely absolves the developer from tracking memory usage and knowing when to free memory. However, the garbage collector doesn't know anything about the resource represented by the type in memory. This means that a garbage collector can't know how to perform step four: tearing down the state of a resource. To get a resource to clean up properly, the developer must write code that knows how to properly clean up a resource. In the .NET Framework, the developer writes this code in a Close, Dispose, or Finalize method, which I'll describe later. However, as you'll see later, the garbage collector can determine when to call the Finalize method automatically.
Also, many types represent resources that do not require any cleanup. For example, a Rectangle resource can be completely cleaned up simply by destroying the left, right, width, and height fields maintained in the type's memory. On the other hand, a type that represents a file resource or a network connection resource will require the execution of some explicit clean up code when the resource is to be destroyed. I will explain how to accomplish all of this properly. For now, let's examine how memory is allocated and how resources are initialized.

Resource Allocation

The Microsoft® .NET common language runtime requires that all resources be allocated from the managed heap. This is similar to a C-runtime heap except that you never free objects from the managed heap—objects are automatically freed when they are no longer needed by the application. This, of course, raises the question: how does the managed heap know when an object is no longer in use by the application? I will address this question shortly.
There are several GC algorithms in use today. Each algorithm is fine-tuned for a particular environment in order to provide the best performance. This article concentrates on the GC algorithm that is used by the common language runtime. Let's start with the basic concepts.
When a process is initialized, the runtime reserves a contiguous region of address space that initially has no storage allocated for it. This address space region is the managed heap. The heap also maintains a pointer, which I'll call the NextObjPtr. This pointer indicates where the next object is to be allocated within the heap. Initially, the NextObjPtr is set to the base address of the reserved address space region.
An application creates an object using the new operator. This operator first makes sure that the bytes required by the new object fit in the reserved region (committing storage if necessary). If the object fits, then NextObjPtr points to the object in the heap, this object's constructor is called, and the new operator returns the address of the object.

At this point, NextObjPtr is incremented past the object so that it points to where the next object will be placed in the heap. Figure 1 shows a managed heap consisting of three objects: A, B, and C. The next object to be allocated will be placed where NextObjPtr points (immediately after object C).
Now let's look at how the C-runtime heap allocates memory. In a C-runtime heap, allocating memory for an object requires walking through a linked list of data structures. Once a large enough block is found, that block has to be split, and pointers in the linked list nodes must be modified to keep everything intact. For the managed heap, allocating an object simply means adding a value to a pointer, which is blazingly fast by comparison. In fact, allocating an object from the managed heap is nearly as fast as allocating memory from a thread's stack!
So far, it sounds like the managed heap is far superior to the C-runtime heap due to its speed and simplicity of implementation. Of course, the managed heap gains these advantages because it makes one really big assumption: address space and storage are infinite. This assumption is (without a doubt) ridiculous, and there must be a mechanism employed by the managed heap that allows the heap to make this assumption. This mechanism is called the garbage collector. Let's see how it works.
When an application calls the new operator to create an object, there may not be enough address space left in the region to allocate to the object. The heap detects this by adding the size of the new object to NextObjPtr. If NextObjPtr is beyond the end of the address space region, then the heap is full and a collection must be performed.
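
The pointer-bump allocation and the full-heap check described above can be modeled in a few lines. This is a toy, not the runtime's implementation: ToyHeap, its integer offsets, and the -1 return value are all assumptions made for illustration, and a real runtime would start a collection at the point where this model simply gives up.

```csharp
using System;

// Toy model of the managed heap: a reserved region plus a NextObjPtr offset.
public class ToyHeap
{
    private readonly int regionSize; // size of the reserved address region
    private int nextObjPtr;          // starts at the base of the region (offset 0)

    public ToyHeap(int regionSize)
    {
        this.regionSize = regionSize;
        this.nextObjPtr = 0;
    }

    public int NextObjPtr
    {
        get { return nextObjPtr; }
    }

    // Returns the offset of the new object, or -1 when the object would run
    // past the end of the region (where a real runtime would collect).
    public int Allocate(int size)
    {
        if (nextObjPtr + size > regionSize)
        {
            return -1;
        }
        int address = nextObjPtr;
        nextObjPtr += size; // bump the pointer past the new object
        return address;
    }
}

public static class ToyHeapDemo
{
    public static void Main()
    {
        ToyHeap heap = new ToyHeap(100);
        Console.WriteLine("A at " + heap.Allocate(40)); // A at 0
        Console.WriteLine("B at " + heap.Allocate(40)); // B at 40
        Console.WriteLine("C at " + heap.Allocate(40)); // C at -1 (does not fit)
    }
}
```

Allocation is a comparison and an addition, which is why it is so cheap compared with searching a free list.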
In reality, a collection occurs when generation 0 is completely full. Briefly, a generation is a mechanism implemented by the garbage collector in order to improve performance. The idea is that newly created objects are part of a young generation, and objects created early in the application's lifecycle are in an old generation. Separating objects into generations can allow the garbage collector to collect specific generations instead of collecting all objects in the managed heap. Generations will be discussed in more detail in Part 2 of this article.

The Garbage Collection Algorithm
The garbage collector checks to see if there are any objects in the heap that are no longer being used by the application. If such objects exist, then the memory used by these objects can be reclaimed. (If no more memory is available for the heap, then the new operator throws an OutOfMemoryException.) How does the garbage collector know if the application is using an object or not? As you might imagine, this isn't a simple question to answer.
Every application has a set of roots. Roots identify storage locations that refer either to objects on the managed heap or to null. For example, all the global and static object pointers in an application are considered part of the application's roots. In addition, any local variable/parameter object pointers on a thread's stack are considered part of the application's roots. Finally, any CPU registers containing pointers to objects in the managed heap are also considered part of the application's roots. The list of active roots is maintained by the just-in-time (JIT) compiler and common language runtime, and is made accessible to the garbage collector's algorithm.
When the garbage collector starts running, it makes the assumption that all objects in the heap are garbage. In other words, it assumes that none of the application's roots refer to any objects in the heap. Now, the garbage collector starts walking the roots and building a graph of all objects reachable from the roots. For example, the garbage collector may locate a global variable that points to an object in the heap.
Figure 2 shows a heap with several allocated objects where the application's roots refer directly to objects A, C, D, and F. All of these objects become part of the graph. When adding object D, the collector notices that this object refers to object H, and object H is also added to the graph. The collector continues to walk through all reachable objects recursively.

Once this part of the graph is complete, the garbage collector checks the next root and walks the objects again. As the garbage collector walks from object to object, if it attempts to add an object to the graph that it previously added, then the garbage collector can stop walking down that path. This serves two purposes. First, it helps performance significantly since it doesn't walk through a set of objects more than once. Second, it prevents infinite loops should you have any circular linked lists of objects.
Once all the roots have been checked, the garbage collector's graph contains the set of all objects that are somehow reachable from the application's roots; any objects that are not in the graph are not accessible by the application, and are therefore considered garbage. The garbage collector now walks through the heap linearly, looking for contiguous blocks of garbage objects (now considered free space). The garbage collector then shifts the non-garbage objects down in memory (using the standard memcpy function that you've known for years), removing all of the gaps in the heap. Of course, moving the objects in memory invalidates all pointers to the objects. So the garbage collector must modify the application's roots so that the pointers point to the objects' new locations. In addition, if any object contains a pointer to another object, the garbage collector is responsible for correcting these pointers as well. Figure 3 shows the managed heap after a collection.
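
The walk from the roots and the compaction step can be sketched with a toy model. Everything here (ToyObject, ToyCollector, the four-object heap) is hypothetical and greatly simplified: objects refer to each other through ordinary C# references, so the pointer fix-up the real collector performs after moving objects is implicit, and the heap list is assumed to be kept in address order.

```csharp
using System;
using System.Collections.Generic;

public class ToyObject
{
    public string Name;
    public int Address;
    public int Size;
    public List<ToyObject> References = new List<ToyObject>();

    public ToyObject(string name, int address, int size)
    {
        Name = name;
        Address = address;
        Size = size;
    }
}

public static class ToyCollector
{
    // Mark phase: walk from the roots and collect every reachable object.
    public static HashSet<ToyObject> Mark(IEnumerable<ToyObject> roots)
    {
        HashSet<ToyObject> reachable = new HashSet<ToyObject>();
        Stack<ToyObject> pending = new Stack<ToyObject>(roots);
        while (pending.Count > 0)
        {
            ToyObject current = pending.Pop();
            // Add returns false for an object already in the graph, so shared
            // objects and circular lists are walked only once.
            if (!reachable.Add(current)) continue;
            foreach (ToyObject child in current.References) pending.Push(child);
        }
        return reachable;
    }

    // Compact phase: drop garbage, slide survivors down, return the new NextObjPtr.
    public static int Compact(List<ToyObject> heap, HashSet<ToyObject> reachable)
    {
        heap.RemoveAll(o => !reachable.Contains(o));
        int nextObjPtr = 0;
        foreach (ToyObject survivor in heap)
        {
            survivor.Address = nextObjPtr;
            nextObjPtr += survivor.Size;
        }
        return nextObjPtr;
    }

    public static void Main()
    {
        ToyObject a = new ToyObject("A", 0, 10);
        ToyObject b = new ToyObject("B", 10, 10); // nothing refers to B
        ToyObject c = new ToyObject("C", 20, 10);
        ToyObject d = new ToyObject("D", 30, 10);
        d.References.Add(c);
        c.References.Add(d); // circular reference

        List<ToyObject> heap = new List<ToyObject>(new ToyObject[] { a, b, c, d });
        HashSet<ToyObject> live = Mark(new ToyObject[] { a, d });
        int next = Compact(heap, live);

        foreach (ToyObject o in heap) Console.WriteLine(o.Name + " at " + o.Address);
        Console.WriteLine("NextObjPtr = " + next);
    }
}
```

In this run B is the only garbage, so A stays at 0, C and D slide down to 10 and 20, and NextObjPtr ends up at 30.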

After all the garbage has been identified, all the non-garbage has been compacted, and all the non-garbage pointers have been fixed-up, the NextObjPtr is positioned just after the last non-garbage object. At this point, the new operation is tried again and the resource requested by the application is successfully created.
As you can see, a GC generates a significant performance hit, and this is the major downside of using a managed heap. However, keep in mind that GCs only occur when the heap is full and, until then, the managed heap is significantly faster than a C-runtime heap. The runtime's garbage collector also offers some optimizations that greatly improve the performance of garbage collection. I'll discuss these optimizations in Part 2 of this article when I talk about generations.
There are a few important things to note at this point. You no longer have to implement any code that manages the lifetime of any resources that your application uses. And notice how the two bugs I discussed at the beginning of this article no longer exist. First, it is not possible to leak resources, since any resource not accessible from your application's roots can be collected at some point. Second, it is not possible to access a resource that is freed, since the resource won't be freed if it is reachable. If it's not reachable, then your application has no way to access it. The code in Figure 4 demonstrates how resources are allocated and managed.
If GC is so great, you might be wondering why it isn't in ANSI C++. The reason is that a garbage collector must be able to identify an application's roots and must also be able to find all object pointers. The problem with C++ is that it allows casting a pointer from one type to another, and there's no way to know what a pointer refers to. In the common language runtime, the managed heap always knows the actual type of an object, and the metadata information is used to determine which members of an object refer to other objects.

Finalization
The garbage collector offers an additional feature that you may want to take advantage of: finalization. Finalization allows a resource to gracefully clean up after itself when it is being collected. By using finalization, a resource representing a file or network connection is able to clean itself up properly when the garbage collector decides to free the resource's memory.
Here is an oversimplification of what happens: when the garbage collector detects that an object is garbage, the garbage collector calls the object's Finalize method (if it exists) and then the object's memory is reclaimed. For example, let's say you have the following type (in C#):

    public class BaseObj {
        public BaseObj() {
        }

        // C# finalizer syntax; the compiler emits this as an
        // override of Object.Finalize.
        ~BaseObj() {
            // Perform resource cleanup code here...
            // Example: Close file/Close network connection
            Console.WriteLine("In Finalize.");
        }
    }

Now you can create an instance of this object by calling:

BaseObj bo = new BaseObj();

Some time in the future, the garbage collector will determine that this object is garbage. When that happens, the garbage collector will see that the type has a Finalize method and will call the method, causing "In Finalize" to appear in the console window and reclaiming the memory block used by this object.
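
To watch this happen without waiting for memory pressure, a collection can be requested explicitly. The sketch below is an assumption-laden experiment rather than a guarantee: GC.Collect and GC.WaitForPendingFinalizers are real methods on System.GC, but TrackedObj, FinalizeDemo, and CreateAndDrop are hypothetical names, and whether the count printed is 1 depends on the runtime and build settings, since the collector is free to keep an object alive longer than expected.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public class TrackedObj
{
    public static int FinalizedCount;

    ~TrackedObj()
    {
        // Runs on the finalizer thread, so update the counter atomically.
        Interlocked.Increment(ref FinalizedCount);
    }
}

public static class FinalizeDemo
{
    // Kept out of line so that no local variable in Main can keep the object alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void CreateAndDrop()
    {
        new TrackedObj(); // unreachable as soon as this method returns
    }

    public static void Main()
    {
        CreateAndDrop();
        GC.Collect();                  // ask for a collection now
        GC.WaitForPendingFinalizers(); // block until queued Finalize calls have run
        Console.WriteLine("Finalized so far: " + TrackedObj.FinalizedCount);
    }
}
```

Forcing collections like this is useful for experiments; production code normally leaves the timing to the garbage collector.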
Many developers who are used to programming in C++ draw an immediate correlation between a destructor and the Finalize method. However, let me warn you right now: object finalization and destructors have very different semantics and it is best to forget everything you know about destructors when thinking about finalization. Managed objects never have destructors—period.
When designing a type it is best to avoid using a Finalize method. There are several reasons for this:

Finalizable objects get promoted to older generations, which increases memory pressure and prevents the object's memory from being collected when the garbage collector determines the object is garbage. In addition, all objects referred to directly or indirectly by this object get promoted as well. Generations and promotions will be discussed in Part 2 of this article.

Finalizable objects take longer to allocate.

Forcing the garbage collector to execute a Finalize method can significantly hurt performance. Remember, each object is finalized. So if I have an array of 10,000 objects, each object must have its Finalize method called.

Finalizable objects may refer to other (non-finalizable) objects, prolonging their lifetime unnecessarily. In fact, you might want to consider breaking a type into two different types: a lightweight type with a Finalize method that doesn't refer to any other objects, and a separate type without a Finalize method that does refer to other objects.

You have no control over when the Finalize method will execute. The object may hold on to resources until the next time the garbage collector runs.

When an application terminates, some objects are still reachable and will not have their Finalize method called. This can happen if background threads are using the objects or if objects are created during application shutdown or AppDomain unloading. In addition, by default, Finalize methods are not called for unreachable objects when an application exits so that the application may terminate quickly. Of course, all operating system resources will be reclaimed, but any objects in the managed heap are not able to clean up gracefully. Early pre-release builds of the runtime exposed a RequestFinalizeOnShutdown method on the System.GC type to change this default; that method is not part of the GC class in the released .NET Framework, and whether finalizers run at process exit has differed between runtime versions, so do not depend on it either way. In any case, a type that changes shutdown behavior is controlling a policy for the entire application, so such a switch should be used with care.

The runtime doesn't make any guarantees as to the order in which Finalize methods are called. For example, let's say there is an object that contains a pointer to an inner object. The garbage collector has detected that both objects are garbage. Furthermore, say that the inner object's Finalize method gets called first. Now, the outer object's Finalize method is allowed to access the inner object and call methods on it, but the inner object has been finalized and the results may be unpredictable. For this reason, it is strongly recommended that Finalize methods not access any inner, member objects.

If you determine that your type must implement a Finalize method, then make sure the code executes as quickly as possible. Avoid all actions that would block the Finalize method, including any thread synchronization operations. Also, do not let any exceptions escape the Finalize method. In .NET 1.x the system just assumes that the Finalize method returned and continues calling other objects' Finalize methods; from .NET 2.0 onward, an unhandled exception in a Finalize method terminates the process by default.
When the compiler generates code for a constructor, the compiler automatically inserts a call to the base type's constructor. Likewise, when a C++ compiler generates code for a destructor, the compiler automatically inserts a call to the base type's destructor. However, as I've said before, Finalize methods are different from destructors. The runtime does nothing to chain Finalize methods, so if you want the base type's Finalize method to run (and frequently you do), the derived type's Finalize method must call it explicitly. Conceptually, the method looks like the following. Shipping C# compilers reject both a direct override of Finalize and a direct call to base.Finalize, so read this as an illustration of what the compiled type contains rather than as code to type in:

    public class BaseObj {
        public BaseObj() {
        }

        protected override void Finalize() {
            Console.WriteLine("In Finalize.");
            base.Finalize(); // Call base type's Finalize
        }
    }

Note that the base type's Finalize method is usually called as the last statement in the derived type's Finalize method. This keeps the object alive as long as possible. Since calling a base type Finalize method is common, C# has a syntax that does this work for you. In C#, the following code

    class MyObject {
        ~MyObject() {
            // •••
        }
    }

causes the compiler to generate code equivalent to this:

    class MyObject {
        protected override void Finalize() {
            try {
                // •••
            }
            finally {
                base.Finalize();
            }
        }
    }

Note that this C# syntax looks identical to the C++ language's syntax for defining a destructor. But remember, C# doesn't support C++-style deterministic destructors. Don't let the identical syntax fool you.

Finalization Internals
On the surface, finalization seems pretty straightforward: you create an object and when the object is collected, the object's Finalize method is called. But there is more to finalization than this.
When an application creates a new object, the new operator allocates the memory from the heap. If the object's type contains a Finalize method, then a pointer to the object is placed on the finalization queue. The finalization queue is an internal data structure controlled by the garbage collector. Each entry in the queue points to an object that should have its Finalize method called before the object's memory can be reclaimed.
Figure 5 shows a heap containing several objects. Some of these objects are reachable from the application's roots, and some are not. When objects C, E, F, I, and J were created, the system detected that these objects had Finalize methods and pointers to these objects were added to the finalization queue.

When a GC occurs, objects B, E, G, H, I, and J are determined to be garbage. The garbage collector scans the finalization queue looking for pointers to these objects. When a pointer is found, the pointer is removed from the finalization queue and appended to the freachable queue (pronounced "F-reachable"). The freachable queue is another internal data structure controlled by the garbage collector. Each pointer in the freachable queue identifies an object that is ready to have its Finalize method called.
After the collection, the managed heap looks like Figure 6. Here, you see that the memory occupied by objects B, G, and H has been reclaimed because these objects did not have a Finalize method that needed to be called. However, the memory occupied by objects E, I, and J could not be reclaimed because their Finalize method has not been called yet.

There is a special runtime thread dedicated to calling Finalize methods. When the freachable queue is empty (which is usually the case), this thread sleeps. But when entries appear, this thread wakes, removes each entry from the queue, and calls each object's Finalize method. Because of this, you should not execute any code in a Finalize method that makes any assumption about the thread that's executing the code. For example, avoid accessing thread local storage in the Finalize method.
The interaction of the finalization queue and the freachable queue is quite fascinating. First, let me tell you how the freachable queue got its name. The f is obvious and stands for finalization; every entry in the freachable queue should have its Finalize method called. The "reachable" part of the name means that the objects are reachable. To put it another way, the freachable queue is considered to be a root just like global and static variables are roots. Therefore, if an object is on the freachable queue, then the object is reachable and is not garbage.
In short, when an object is not reachable, the garbage collector considers the object garbage. Then, when the garbage collector moves an object's entry from the finalization queue to the freachable queue, the object is no longer considered garbage and its memory is not reclaimed. At this point, the garbage collector has finished identifying garbage. Some of the objects identified as garbage have been reclassified as not garbage. The garbage collector compacts the reclaimable memory and the special runtime thread empties the freachable queue, executing each object's Finalize method.


The next time the garbage collector is invoked, it sees that the finalized objects are truly garbage, since the application's roots don't point to them and the freachable queue no longer points to them. Now the memory for these objects is simply reclaimed. The important thing to understand here is that two GCs are required to reclaim memory used by objects that require finalization. In reality, more than two collections may be necessary since the objects could get promoted to an older generation. Figure 7 shows what the managed heap looks like after the second GC.
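
The interplay of the two queues can be traced with a toy model that follows the object names used in this section. It is a sketch only: ToyFinalizationModel is a hypothetical class, objects are plain strings, references between objects are ignored, and the finalizer thread is replaced by an ordinary method call.

```csharp
using System;
using System.Collections.Generic;

public class ToyFinalizationModel
{
    public List<string> Heap = new List<string>();
    public List<string> FinalizationQueue = new List<string>();
    public Queue<string> FReachableQueue = new Queue<string>();

    public void Allocate(string name, bool hasFinalizer)
    {
        Heap.Add(name);
        if (hasFinalizer) FinalizationQueue.Add(name);
    }

    // One collection; "roots" lists the objects the application can still reach.
    public void Collect(ICollection<string> roots)
    {
        foreach (string name in new List<string>(Heap))
        {
            // The freachable queue acts as a root, just like the application's roots.
            if (roots.Contains(name) || FReachableQueue.Contains(name)) continue;

            if (FinalizationQueue.Remove(name))
                FReachableQueue.Enqueue(name); // kept alive until Finalize has run
            else
                Heap.Remove(name);             // memory reclaimed
        }
    }

    // Stand-in for the dedicated finalizer thread.
    public void RunFinalizers()
    {
        while (FReachableQueue.Count > 0)
            Console.WriteLine("Finalize " + FReachableQueue.Dequeue());
    }

    public static void Main()
    {
        ToyFinalizationModel model = new ToyFinalizationModel();
        foreach (char ch in "ABCDEFGHIJ")
            model.Allocate(ch.ToString(), "CEFIJ".IndexOf(ch) >= 0);

        string[] roots = { "A", "C", "D", "F" };

        model.Collect(roots);
        Console.WriteLine("After 1st GC: " + string.Join(" ", model.Heap.ToArray()));
        model.RunFinalizers();
        model.Collect(roots);
        Console.WriteLine("After 2nd GC: " + string.Join(" ", model.Heap.ToArray()));
    }
}
```

The first pass reclaims only B, G, and H and leaves "A C D E F I J" on the heap; E, I, and J are then finalized, and the second pass leaves "A C D F".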

Resurrection
The whole concept of finalization is fascinating. However, there is more to it than what I've described so far. You'll notice in the previous section that when an application is no longer accessing a live object, the garbage collector considers the object to be dead. However, if the object requires finalization, the object is considered live again until it is actually finalized, and then it is permanently dead. In other words, an object requiring finalization dies, lives, and then dies again. This is a very interesting phenomenon called resurrection. Resurrection, as its name implies, allows an object to come back from the dead.
I've already described a form of resurrection. When the garbage collector places a reference to the object on the freachable queue, the object is reachable from a root and has come back to life. Eventually, the object's Finalize method is called, no roots point to the object, and the object is dead forever after. But what if an object's Finalize method executed code that placed a pointer to the object in a global or static variable?

    public class BaseObj {

        // C# finalizer syntax; compiled to an override of Object.Finalize.
        ~BaseObj() {
            Application.ObjHolder = this;
        }
    }

    class Application {
        public static Object ObjHolder; // Defaults to null
        // •••
    }

In this case, when the object's Finalize method executes, a pointer to the object is placed in a root and the object is reachable from the application's code. This object is now resurrected and the garbage collector will not consider the object to be garbage. The application is free to use the object, but it is very important to note that the object has been finalized and that using the object may cause unpredictable results. Also note: if BaseObj contained members that pointed to other objects (either directly or indirectly), all objects would be resurrected, since they are all reachable from the application's roots. However, be aware that some of these other objects may also have been finalized.
In fact, when designing your own object types, objects of your type can get finalized and resurrected totally out of your control. Implement your code so that you handle this gracefully. For many types, this means keeping a Boolean flag indicating whether the object has been finalized or not. Then, if methods are called on your finalized object, you might consider throwing an exception. The exact technique to use depends on your type.
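
One way to keep such a flag is sketched below. The names GuardedResource, MarkFinalized, and DoWork are hypothetical, and the choice of InvalidOperationException is only one reasonable option; the MarkFinalized helper exists so that the flag logic can be exercised without waiting for the garbage collector.

```csharp
using System;

public class GuardedResource
{
    private bool finalized; // set once the cleanup code has run

    ~GuardedResource()
    {
        MarkFinalized();
    }

    // Hypothetical helper shared with the finalizer.
    internal void MarkFinalized()
    {
        // Release the underlying resource here...
        finalized = true;
    }

    public void DoWork()
    {
        if (finalized)
        {
            throw new InvalidOperationException("Object has already been finalized.");
        }
        Console.WriteLine("Working...");
    }
}

public static class GuardedResourceDemo
{
    public static void Main()
    {
        GuardedResource resource = new GuardedResource();
        resource.DoWork();

        resource.MarkFinalized(); // simulate the finalizer having run
        try
        {
            resource.DoWork();
        }
        catch (InvalidOperationException e)
        {
            Console.WriteLine("Rejected: " + e.Message);
        }
    }
}
```

A resurrected object that is used after finalization then fails loudly instead of operating on a resource that is already gone.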
Now, if some other piece of code sets Application.ObjHolder to null, the object is unreachable. Eventually the garbage collector will consider the object to be garbage and will reclaim the object's storage. Note that the object's Finalize method will not be called because no pointer to the object exists on the finalization queue.
There are very few good uses of resurrection, and you really should avoid it if possible. However, when people do use resurrection, they usually want the object to clean itself up gracefully every time the object dies. To make this possible, the GC type offers a method called ReRegisterForFinalize, which takes a single parameter: the pointer to an object.

    public class BaseObj {

        // C# finalizer syntax; compiled to an override of Object.Finalize.
        ~BaseObj() {
            Application.ObjHolder = this;
            GC.ReRegisterForFinalize(this);
        }
    }

When this object's Finalize method is called, it resurrects itself by making a root point to the object. The Finalize method then calls ReRegisterForFinalize, which appends the address of the specified object (this) to the end of the finalization queue. When the garbage collector detects that this object is unreachable again, it will queue the object's pointer on the freachable queue and the Finalize method will get called again. This specific example shows how to create an object that constantly resurrects itself and never dies, which is usually not desirable. It is far more common to conditionally set a root to reference the object inside the Finalize method.
Make sure that you call ReRegisterForFinalize no more than once per resurrection, or the object will have its Finalize method called multiple times. This happens because each call to ReRegisterForFinalize appends a new entry to the end of the finalization queue. When an object is determined to be garbage, all of these entries move from the finalization queue to the freachable queue, calling the object's Finalize method multiple times.

Forcing an Object to Clean Up
If you can, you should try to define objects that do not require any clean up. Unfortunately, for many objects, this is simply not possible. So for these objects, you must implement a Finalize method as part of the type's definition. However, it is also recommended that you add an additional method to the type that allows a user of the type to explicitly clean up the object when they want. By convention, this method should be called Close or Dispose.
In general, you use Close if the object can be reopened or reused after it has been closed. You also use Close if the object is generally considered to be closed, such as a file. On the other hand, you would use Dispose if the object should no longer be used at all after it has been disposed. For example, to delete a System.Drawing.Brush object, you call its Dispose method. Once disposed, the Brush object cannot be used, and calling methods to manipulate the object may cause exceptions to be thrown. If you need to work with another Brush, you must construct a new Brush object.
Now, let's look at what the Close/Dispose method is supposed to do. The System.IO.FileStream type allows the user to open a file for reading and writing. To improve performance, the type's implementation makes use of a memory buffer. Only when the buffer fills does the type flush the contents of the buffer to the file. Let's say that you create a new FileStream object and write just a few bytes of information to it. If these bytes don't fill the buffer, then the buffer is not written to disk. The FileStream type does implement a Finalize method, and when the FileStream object is collected the Finalize method flushes any remaining data from memory to disk and then closes the file.
But this approach may not be good enough for the user of the FileStream type. Let's say that the first FileStream object has not been collected yet, but the application wants to create a new FileStream object using the same disk file. In this scenario, the second FileStream object will fail to open the file if the first FileStream object had the file open for exclusive access. The user of the FileStream object must have some way to force the final memory flush to disk and to close the file.
If you examine the FileStream type's documentation, you'll see that it has a method called Close. When called, this method flushes the remaining data in memory to the disk and closes the file. Now the user of a FileStream object has control of the object's behavior.
But an interesting problem arises now: what should the FileStream's Finalize method do when the FileStream object is collected? Obviously, the answer is nothing. In fact, there is no reason for the FileStream's Finalize method to execute at all if the application has explicitly called the Close method. You know that Finalize methods are discouraged, and in this scenario you're going to have the system call a Finalize method that should do nothing. It seems like there ought to be a way to suppress the system's calling of the object's Finalize method. Fortunately, there is. The System.GC type contains a static method, SuppressFinalize, that takes a single parameter, the address of an object.
Figure 8 shows FileStream's type implementation. When you call SuppressFinalize, it turns on a bit flag associated with the object. When this flag is on, the runtime knows not to move this object's pointer to the freachable queue, preventing the object's Finalize method from being called.
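The pairing of an explicit Close method with SuppressFinalize can be sketched roughly as follows. This is only an illustration of the pattern, not FileStream's actual implementation: the BufferedResource type and its CleanupCount and IsClosed members are hypothetical names invented for the example.

```csharp
using System;

// Hypothetical resource wrapper; the type and member names are illustrative only.
public class BufferedResource
{
    private bool closed;

    // Number of times the cleanup logic actually ran (for demonstration only).
    public int CleanupCount { get; private set; }

    public bool IsClosed { get { return closed; } }

    // Explicit cleanup: release the resource now, then tell the GC that
    // the finalizer no longer needs to run for this object.
    public void Close()
    {
        Cleanup();
        GC.SuppressFinalize(this);
    }

    // C# destructor syntax; the compiler emits this as an override of Finalize.
    // It only runs if Close was never called.
    ~BufferedResource()
    {
        Cleanup();
    }

    private void Cleanup()
    {
        if (closed) return;   // safe to call more than once
        // A real type would flush its buffer and close its handle here.
        closed = true;
        CleanupCount++;
    }
}
```

A caller that is finished with the object simply calls Close; a second call to Close does nothing because the flag is already set.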
Let's examine another related issue. It is very common to use a StreamWriter object with a FileStream object.

FileStream fs = new FileStream("C:\\SomeFile.txt",
FileMode.Open, FileAccess.Write, FileShare.Read);
StreamWriter sw = new StreamWriter(fs);
sw.Write ("Hi there");

// The call to Close below is what you should do
sw.Close();
// NOTE: StreamWriter.Close closes the FileStream. The FileStream
// should not be explicitly closed in this scenario

Notice that the StreamWriter's constructor takes a FileStream object as a parameter. Internally, the StreamWriter object saves the FileStream's pointer. Both of these objects have internal data buffers that should be flushed to the file when you're finished accessing the file. Calling the StreamWriter's Close method writes the final data to the FileStream and internally calls the FileStream's Close method, which writes the final data to the disk file and closes the file. Since StreamWriter's Close method closes the FileStream object associated with it, you should not call fs.Close yourself.
What do you think would happen if you removed the call to Close? Well, the garbage collector would correctly detect that the objects are garbage and the objects would get finalized. But the garbage collector doesn't guarantee the order in which the Finalize methods are called. So if the FileStream gets finalized first, it closes the file. Then when the StreamWriter gets finalized, it would attempt to write data to the closed file, raising an exception. Of course, if the StreamWriter got finalized first, then the data would be safely written to the file.
How did Microsoft solve this problem? Making the garbage collector finalize objects in a specific order is impossible because objects could contain pointers to each other and there is no way for the garbage collector to correctly guess the order to finalize these objects. So, here is Microsoft's solution: the StreamWriter type doesn't implement a Finalize method at all. Of course, this means that forgetting to explicitly close the StreamWriter object guarantees data loss. Microsoft expects that developers will see this consistent loss of data and will fix the code by inserting an explicit call to Close.
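One way to make sure the explicit Close is never forgotten is the C# using statement, which calls Dispose (and therefore Close) on the StreamWriter even if an exception is thrown. The sketch below assumes a hypothetical helper named WriteAndReadBack and uses FileMode.Create rather than the FileMode.Open of the earlier listing, so that it also works when the file does not exist yet.

```csharp
using System;
using System.IO;

public static class WriterCloseSketch
{
    // Writes text through a StreamWriter layered on a FileStream and
    // returns what actually ended up in the file.
    public static string WriteAndReadBack(string path, string text)
    {
        FileStream fs = new FileStream(path, FileMode.Create,
            FileAccess.Write, FileShare.Read);

        // Leaving the using block closes the StreamWriter, which flushes
        // its buffer into the FileStream and then closes the FileStream.
        using (StreamWriter sw = new StreamWriter(fs))
        {
            sw.Write(text);
        }

        // The file is closed at this point, so it can be opened again.
        return File.ReadAllText(path);
    }
}
```

As in the listing above, the FileStream is not closed separately; closing the StreamWriter takes care of it.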
As stated earlier, the SuppressFinalize method simply sets a bit flag indicating that the object's Finalize method should not be called. However, this flag is reset when the runtime determines that it's time to call a Finalize method. This means that calls to ReRegisterForFinalize cannot be balanced by calls to SuppressFinalize. The code in Figure 9 demonstrates exactly what I mean.
ReRegisterForFinalize and SuppressFinalize are implemented the way they are for performance reasons. As long as each call to SuppressFinalize has an intervening call to ReRegisterForFinalize, everything works. It is up to you to ensure that you do not call ReRegisterForFinalize or SuppressFinalize multiple times consecutively, or multiple calls to an object's Finalize method can occur.

Conclusion
The motivation for garbage-collected environments is to simplify memory management for the developer. The first part of this overview looked at some general GC concepts and internals. In Part 2, I will conclude this discussion. First, I will explore a feature called WeakReferences, which you can use to reduce the memory pressure placed on the managed heap by large objects. Then I'll examine a mechanism that allows you to artificially extend the lifetime of a managed object. Finally, I'll wrap up by discussing various aspects of the garbage collector's performance. I'll discuss generations, multithreaded collections, and the performance counters that the common language runtime exposes, which allow you to monitor the garbage collector's real-time behavior.

Monday, September 12, 2005

Convert Custom Objects in XML Data Using Reflection

//Created By Kaushal
//Requires: using System.Reflection; using System.Text;
public string ConvertToXML()
{
    //output stores the parent object's XML representation
    StringBuilder output = new StringBuilder();

    //refOutput stores the child objects' XML representation
    StringBuilder refOutput = new StringBuilder();

    //The Object node is treated as the root node (element name assumed from this note)
    output.Append("<Object>");

    //Looping over the parent objects
    for (int i=0;i<=this.Count-1;i++)
    {
        //Getting property info using reflection
        PropertyInfo[] objFields = this.List[i].GetType().GetProperties();

        //Adding the reflected object type to the XML string as the parent node of the entity
        output.Append("<" + objFields[0].ReflectedType.Name + ">");

        //Appending the data as separate attributes of the node
        output.Append("<" + objFields[0].ReflectedType.Name);

        //Accessing each property
        foreach(PropertyInfo prop in objFields)
        {
            //Checking whether the current property is readable
            if (prop.CanRead)
            {
                //Reference types are handled here; value types and plain strings are handled in the else part
                if ((!prop.PropertyType.IsValueType) && (prop.PropertyType !=typeof(System.String)) && (prop.PropertyType !=typeof(Framework.CError)) )
                {
                    //Getting the value from the reflected object
                    object oCustomObject = prop.GetValue(this[i] as object,null);

                    //A non-null reference type is converted by the recursive private overload below
                    if (null!=oCustomObject)
                    {
                        refOutput.Append(ConvertToXML(oCustomObject));
                    }
                }
                else
                {
                    //Getting the property data in the case of a value type or string
                    Object strObjectString = prop.GetValue(this.List[i] as object,null);

                    if (null!=strObjectString)
                    {
                        output.Append(" " + prop.Name + "='");
                        output.Append(strObjectString.ToString()+ "'");
                    }
                }
            }
        }
        output.Append("/>");
        output.Append(refOutput.ToString());
        refOutput.Remove(0,refOutput.Length);
        output.Append("</" + objFields[0].ReflectedType.Name + ">");
        output = output.Replace("Key","ID");
    }
    output.Append("</Object>");
    return output.ToString();
}

//Created By Kaushal
private string ConvertToXML(object oUserObject)
{

StringBuilder output = new StringBuilder();
StringBuilder refOutput = new StringBuilder();
Type objectType = oUserObject.GetType().BaseType;
int iCount = objectType==typeof(System.Object)? 1 : (oUserObject as EntityCollectionBase).Count;
output.Append(objectType==typeof(System.Object)? "<" + oUserObject.GetType().Name + ">" : "");

for (int i=0;i<=iCount-1;i++)
{
PropertyInfo[] objFields = objectType==typeof(System.Object)? oUserObject.GetType().GetProperties() : (oUserObject as EntityCollectionBase)[i].GetType().GetProperties();
Object oTmpObject = objectType==typeof(System.Object)? oUserObject : (oUserObject as EntityCollectionBase)[i] as object;
output.Append("<" + objFields[0].ReflectedType.Name.ToString() + ">");
output.Append("<" + objFields[0].ReflectedType.Name.ToString());

foreach(PropertyInfo prop in objFields)
{
if (prop.CanRead)
{
if ((!prop.PropertyType.IsValueType) && (prop.PropertyType !=typeof(System.String)) && (prop.PropertyType !=typeof(Framework.CError)) )
{
object oCustomObject = prop.GetValue(oTmpObject as object,null) as object;
if (null!=oCustomObject)
//Making Nested Calls
refOutput.Append(ConvertToXML(oCustomObject));
}
else
{
Object strObjectString = prop.GetValue(oTmpObject,null);
if (null!=strObjectString)
{
output.Append(" " + prop.Name.ToString() + "='");
output.Append(strObjectString.ToString()+ "'");
}
}
}
}
output.Append("/>");
output.Append(refOutput.ToString());
refOutput.Remove(0,refOutput.Length);
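output.Append("</" + objFields[0].ReflectedType.Name.ToString() + ">");
output = output.Replace("Key","ID");
}

output.Append(objectType==typeof(System.Object)? "</" + oUserObject.GetType().Name + ">" : "");
return output.ToString();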

}
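The core idea of the listing, reading each property through reflection and writing it out as an XML attribute, can be tried in isolation with the small sketch below. It is not the author's collection class: Customer and XmlAttributeSketch are hypothetical names, and the sketch additionally escapes attribute values with SecurityElement.Escape, which the listing above does not do.

```csharp
using System;
using System.Reflection;
using System.Security;
using System.Text;

// Hypothetical entity type used only for this illustration.
public class Customer
{
    public int Key { get; set; }
    public string Name { get; set; }
}

public static class XmlAttributeSketch
{
    // Emits <TypeName Prop='value' ... /> for the readable value-type
    // and string properties of a single object.
    public static string ToElement(object entity)
    {
        Type type = entity.GetType();
        StringBuilder output = new StringBuilder();
        output.Append("<").Append(type.Name);

        foreach (PropertyInfo prop in type.GetProperties())
        {
            // Skip write-only properties and indexers.
            if (!prop.CanRead || prop.GetIndexParameters().Length > 0) continue;
            // Skip reference types other than string (no recursion in this sketch).
            if (!prop.PropertyType.IsValueType && prop.PropertyType != typeof(string)) continue;

            object value = prop.GetValue(entity, null);
            if (value == null) continue;

            // SecurityElement.Escape replaces <, >, &, " and ' with XML entities.
            output.Append(" ").Append(prop.Name).Append("='")
                  .Append(SecurityElement.Escape(value.ToString())).Append("'");
        }

        output.Append("/>");
        return output.ToString();
    }
}
```

Calling XmlAttributeSketch.ToElement(new Customer { Key = 7, Name = "A&B" }) yields a single Customer element whose Name attribute carries the escaped value A&amp;B.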


For more details, mail me at mail2kaushal@gmail.com

Thursday, September 08, 2005

Generate Property Classes On Fly : Kaushal Patel

Class Used : PropertyBuilder [PropertyBuilder Class defines the properties for a type.]

Code :


using System;
using System.Reflection;
using System.Reflection.Emit;
using System.Threading;


class PropertyBuilderDemo
{
public static Type BuildProperties()
{
AppDomain _Domain = Thread.GetDomain();
AssemblyName _AssemblyName = new AssemblyName();
_AssemblyName.Name = "DynamicAssembly";

AssemblyBuilder _AssemblyBuilder = _Domain.DefineDynamicAssembly(_AssemblyName,AssemblyBuilderAccess.RunAndSave);

ModuleBuilder _ModuleBuilder = _AssemblyBuilder.DefineDynamicModule("DynamicModule");
TypeBuilder _TypeBuilder = _ModuleBuilder.DefineType("CustomerData",TypeAttributes.Public);
FieldBuilder _customerName = _TypeBuilder.DefineField("_CustomerName",typeof(string),FieldAttributes.Private);
PropertyBuilder _CustomerName = _TypeBuilder.DefineProperty("CustomerName",PropertyAttributes.HasDefault,typeof(string),null); //null parameter types: CustomerName is not an indexed property
//Assigning 'Get' Behavior to Customer Name Property

MethodBuilder _GetCustomerName = _TypeBuilder.DefineMethod("GetCustomerName",MethodAttributes.Public,typeof(string),new Type[] { });

ILGenerator _GetILGenerator = _GetCustomerName.GetILGenerator();
_GetILGenerator.Emit(OpCodes.Ldarg_0);
_GetILGenerator.Emit(OpCodes.Ldfld,_customerName);
_GetILGenerator.Emit(OpCodes.Ret);
//Assigning 'Set' Behavior to Customer Name Property
MethodBuilder _SetCustomerName = _TypeBuilder.DefineMethod("SetCustomerName",MethodAttributes.Public,null,new Type[] { typeof(string) });

ILGenerator _SetILGenerator = _SetCustomerName.GetILGenerator();
_SetILGenerator.Emit(OpCodes.Ldarg_0);
_SetILGenerator.Emit(OpCodes.Ldarg_1);
_SetILGenerator.Emit(OpCodes.Stfld,_customerName);
_SetILGenerator.Emit(OpCodes.Ret);
//Attaching Get and Set behaviors to property
_CustomerName.SetGetMethod(_GetCustomerName);
_CustomerName.SetSetMethod(_SetCustomerName);

return _TypeBuilder.CreateType();
}
}


///Client Code

static void Main(string[] args)
{
Type customer = PropertyBuilderDemo.BuildProperties();
PropertyInfo[] properties = customer.GetProperties();
object customerData = Activator.CreateInstance(customer);
properties[0].SetValue(customerData,"CREATED BY : Kaushal Patel",BindingFlags.SetProperty,null,null,null);
string s = properties[0].GetValue(customerData,BindingFlags.GetProperty,null,null,null).ToString();
Console.WriteLine(s.ToString());
Console.ReadLine();
}

For more information, mail me at mail2kaushal@gmail.com

Tuesday, September 06, 2005

The Importance of Using Managed Code in .NET

Introduction


What is managed code and why is it important to use 100% managed code in .NET applications?



Managed code is compiled for the .NET run-time environment. It runs in the Common Language Runtime (CLR), which is the heart of the .NET Framework. The CLR provides services such as security, memory management, and cross-language integration. (3) Managed applications written to take advantage of the features of the CLR perform more efficiently and safely, and take better advantage of developers’ existing expertise in languages that support the .NET Framework.



Unmanaged code includes all code written before the .NET Framework was introduced; this includes code written to use COM, native Win32, and Visual Basic 6. Because it does not run inside the .NET environment, unmanaged code cannot make use of any .NET managed facilities. (1)



Advantages of Using Managed Code


Managed code runs entirely “inside the sandbox,” meaning that it makes no calls outside of the .NET Framework. That’s why managed code gets the maximum benefit from the features of the .NET Framework, and why applications built with managed code perform more safely and efficiently.



Performance


The CLR was designed from the start to provide good performance. By using 100% managed code, you can take advantage of the numerous built-in services of the CLR to enhance the performance of your managed application. Because of the runtime services and checks that the CLR performs, applications do not need to include separate versions of these services. (9) And by using 100% managed code, you eliminate the performance costs associated with calling unmanaged code.



Just-In-Time compiler


The CLR never executes Common Intermediate Language (CIL) directly. Instead, the Just-In-Time (JIT) compiler translates CIL into optimized x86 native instructions. (9) That’s why using managed code lets your software run in different environments safely and efficiently. In addition, using machine language lets you take full advantage of the features of the processor the application is running on. For example, when the JIT encounters an Intel processor, the code produced takes advantage of hyper-threading technology. (5)



Another advantage of the JIT is improved performance. The JIT learns when the code does multiple iterations. The runtime is designed to be able to retune the JIT compiled code as your program runs. (2)



NGEN utility


NGEN.exe is a .NET utility that pre-compiles the application at install time. Pre-compiling improves start-up performance for managed code, especially when the application uses Windows Forms. Methods are JITed when they are first used, incurring a larger startup penalty if the application calls many methods during start-up. Because Windows Forms uses many shared libraries in the operating system, pre-compiling Windows Forms applications usually improves performance. (12)



Pre-compiling also makes sure that the application is optimized for the machine on which it is being installed.



Maintaining a 100% managed code environment


Only when your .NET application uses components that are built using 100% managed code do you receive the full benefits of the .NET environment. For example, when accessing data through ADO.NET, using wire protocol .NET data providers lets you preserve your managed code environment because they do not make calls to native Win32 APIs and Client pieces.



The performance advantages of the managed code environment are lost when you (or the components you are using) call unmanaged code. The CLR makes additional checks on calls to the unmanaged or native code, which impacts performance.



Unmanaged code includes the database client pieces that some .NET data providers require. Examples of .NET data providers that use both managed and unmanaged code are IBM’s DB2 data provider and the Oracle Data Provider for .NET (ODP.NET). Both of these data providers must use client libraries to access the database. The data providers shipped by Microsoft for SQL Server and Oracle, as well as the Microsoft OLE DB data providers and ODBC.NET, make calls to native Win32 database client pieces or other unmanaged code.



Automatic memory management


Automatic memory management is one of the most significant features of managed code. The CLR garbage collector automatically frees allocated objects when there are no longer any outstanding references to them. The developer does not need to explicitly free memory assigned to an object, which goes a long way toward reducing the amount of time spent debugging memory leaks. (10) There can be no memory leaks in 100% managed code.




Automatic lifetime control of objects


Another significant advantage of using managed code is that the CLR provides automatic lifetime management of components and modules. Lifetime control includes:

• Garbage collection, which frees and compacts memory.

• Scalability features, such as thread pooling and the ability to use a nonpersistent connection with a dataset.

• Support for side-by-side versions.



Garbage collection


When an object is created with the new operator, the runtime allocates memory from the managed heap. Periodically, the CLR garbage collector checks the heap and automatically disposes of any objects that are no longer being used by the application, reclaiming their memory.



The garbage collector also compacts the released memory, reducing fragmentation. (4) This function is particularly important when the application runs on large memory servers. Changing the application to use smaller objects can help to improve the effectiveness of the garbage collector. Similarly, because each DLL is assigned a 64 KB chunk of memory, combining small DLLs avoids inefficient use of memory.



Because the garbage collector automatically closes unused objects, memory leaks are not possible in an application that uses 100% managed code.



Scalability features


Thread pooling lets you make much more efficient use of multiple threads and is an important scalability feature of using managed code. The .NET Framework comes with built-in support for creating threads and using the system-provided thread pool. In particular, the ThreadPool class under the System.Threading namespace provides static methods for submitting requests to the thread pool. In managed code, if one of the threads becomes idle, the thread pool injects another worker thread into the multithreaded apartment to keep all the processors busy.
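Submitting requests to the system thread pool might look roughly like the sketch below. ThreadPool.QueueUserWorkItem is the real static method; the ThreadPoolSketch class, the SumOfSquares helper, and the choice of work (summing squares) are assumptions made purely for illustration.

```csharp
using System;
using System.Threading;

public static class ThreadPoolSketch
{
    // Queues 'count' work items on the system thread pool and returns
    // the sum of squares 1*1 + 2*2 + ... + count*count that they compute.
    public static long SumOfSquares(int count)
    {
        if (count <= 0) return 0;

        long total = 0;
        int pending = count;

        using (ManualResetEvent done = new ManualResetEvent(false))
        {
            for (int i = 1; i <= count; i++)
            {
                // Each request runs on whichever pool thread becomes available.
                ThreadPool.QueueUserWorkItem(state =>
                {
                    int n = (int)state;
                    Interlocked.Add(ref total, (long)n * n);

                    // The last work item to finish signals the waiting caller.
                    if (Interlocked.Decrement(ref pending) == 0) done.Set();
                }, i);
            }

            // Block until every queued item has run.
            done.WaitOne();
        }

        return Interlocked.Read(ref total);
    }
}
```

The caller never creates or destroys a thread itself; it only hands work to the pool and waits for the completion signal.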



The standard ThreadPool methods capture the caller’s stack and merge it into the stack of the thread-pool thread when the thread-pool thread starts to execute a task. If you are using unmanaged code, the entire stack will be checked, which incurs a performance cost. In some cases, you can eliminate the stack checking with the Unsafe methods ThreadPool.UnsafeQueueUserWorkItem and ThreadPool.UnsafeRegisterWaitForSingleObject, which provide better performance. However, using the Unsafe method calls does not provide complete safety. (8)



Further adding to scalability is the ability to use a non-persistent connection with a dataset, which is a cache of the records retrieved from the database. The dataset keeps track of the state of the data and stores the data as pure XML. Database connections are opened and closed only as needed to retrieve data into the dataset, or to return updated data. (7)
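The way a dataset tracks the state of its data while disconnected can be seen in a small sketch. The DataSet, DataTable, and DataRowState types are the real ADO.NET classes; the DisconnectedDataSetSketch class, the Customer table, and its rows are made up for the example, and the table is filled by hand where a real application would fill it from a data adapter and then close the connection.

```csharp
using System;
using System.Data;

public static class DisconnectedDataSetSketch
{
    // Builds an in-memory cache by hand, standing in for data that a
    // data adapter would normally load before the connection is closed.
    public static DataSet BuildCustomerCache()
    {
        DataSet cache = new DataSet("CustomerCache");
        DataTable customers = cache.Tables.Add("Customer");
        customers.Columns.Add("Id", typeof(int));
        customers.Columns.Add("Name", typeof(string));
        customers.Rows.Add(1, "Alice");
        customers.Rows.Add(2, "Bob");

        // Mark every row as unchanged, as if it had just been loaded.
        cache.AcceptChanges();
        return cache;
    }

    // Edits a row with no connection open and returns the number of
    // rows that would need to be sent back to the database.
    public static int RenameFirstCustomer(DataSet cache, string newName)
    {
        cache.Tables["Customer"].Rows[0]["Name"] = newName;

        // GetChanges returns null when nothing matches the requested state.
        DataSet changes = cache.GetChanges(DataRowState.Modified);
        return changes == null ? 0 : changes.Tables["Customer"].Rows.Count;
    }
}
```

After RenameFirstCustomer runs, the edited row reports DataRowState.Modified, and cache.GetXml() returns the cached records as an XML string.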



Versioning


Versioning essentially eliminates “DLL hell.” When you define an assembly as strongly named, the .NET executable will be executed with the same DLL with which it was built. This means that you can have side-by-side versions of a DLL, allowing you to manage shared components. Versioning ensures that each time an application starts up, it checks its shared files. If a file has changed and the changes are incompatible, the application can ask the runtime for a compatible version.



However, when an application calls unmanaged DLLs, you can end up back in “DLL hell.” For example, Oracle’s ODP.NET data provider calls the unmanaged Oracle Client pieces, which are specific to a particular version of Oracle. You could install two versions of this unmanaged data provider, for example, one for Oracle9i and one for the upcoming Oracle10G, but you would have a conflict, because each data provider will require a particular version of the clients. Since the clients are native Win32 DLLs, you cannot easily have side-by-side versions running on the same machine. Only with native wire protocol data providers built from 100% managed code can you install side-by-side versions with no configuration required by the end-user.



Checks by the .NET runtime


The .NET runtime automatically performs numerous checks to ensure that code is written correctly. Because these checks prevent a large number of bugs from ever happening, developer productivity is improved and the application quality is better. In addition, these checks thwart system attacks such as the exploitation of buffer overruns.



The CLR checks for type safety to ensure that applications always access allocated objects in appropriate ways. In other words, if a method input parameter is declared as accepting a 4-byte value, the common language runtime will detect and trap attempts to access the parameter as an 8-byte value. Type safety also means that execution flow will only transfer to known method entry points. There is no way to construct an arbitrary reference to a memory location and cause code at that location to begin execution.



In addition, array indexes are checked to be sure they are in the range of the array. For example, if an object occupies 10 bytes in memory, the application can’t change the object so that it will allow more than 10 bytes to be read. (11)
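Both kinds of runtime check described above are easy to observe from C#. The sketch below is only a demonstration under simple assumptions: the RuntimeCheckSketch class and its two method names are invented for the example, while IndexOutOfRangeException and InvalidCastException are the exceptions the runtime actually raises.

```csharp
using System;

public static class RuntimeCheckSketch
{
    // Returns true if the runtime rejects a read past the end of an array.
    public static bool OutOfRangeReadIsTrapped()
    {
        int[] values = new int[10];
        int index = values.Length;   // one past the last valid index

        try
        {
            int unused = values[index];
            return false;            // not reached: the read is never performed
        }
        catch (IndexOutOfRangeException)
        {
            return true;
        }
    }

    // Returns true if the runtime rejects treating a string as an int.
    public static bool InvalidCastIsTrapped()
    {
        object boxed = "not a number";

        try
        {
            int unused = (int)boxed;
            return false;            // not reached: the cast is checked at run time
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

In both cases the faulty access is stopped with an exception instead of reading whatever memory happens to sit next to the object.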



Cross-language integration


You can write .NET applications in many different languages, such as C#, C++, Visual Basic, COBOL, Fortran, Perl, Pascal, JScript, Lisp, Python, Smalltalk, and others. Programmers can use the languages that they are most proficient with to develop portions of an application.




All CLR-compliant languages compile to Common Intermediate Language (CIL). CIL is the key to making the .NET application platform-neutral and hardware independent.



In addition, programmers can choose specific languages for specific tasks within the same application. Some languages are stronger than others for particular tasks, and programmers can choose the language best suited for the task. The originating language doesn’t matter, because all .NET-compliant compilers produce CIL.



Platform-neutrality


A managed .NET application can execute on any Windows platform that supports the .NET common language runtime. Currently, these platforms are Windows 98, Windows 2000, Windows Me, Windows NT, Windows XP, and Windows Server 2003 (32-bit). Support for the .NET Framework and Common Language Runtime on Windows Server 2003 (64-bit) is planned for an upcoming release.



In addition, with the Microsoft Mobile Internet Toolkit, developers can create a .NET compliant, mobile Web application that can be adapted to the display of multiple wireless devices. (6)



Security


Managed code does not have direct access to memory, machine registers, or pointers. The .NET Framework enforces security restrictions on managed code that protect the code and data from being misused or damaged by other code. An administrator can define a security policy to grant or revoke permissions at the enterprise, machine, assembly, or user level. For these reasons, applications that use managed code are much safer.



Code access security


Code access security lets the administrator specify which operations a piece of code can perform, stopping inappropriate behavior before it can start. You can configure a complex set of rules to:

• Specify whether a code group can both read and write files

• Demand that the code’s callers have specific permissions

• Allow only callers from a particular organization or site to call the code

• Grant permissions to each assembly that is loaded

• Compare the granted permissions of every caller on the call stack at runtime to the permissions that callers must have and which resources the code can access. (6)



The access privileges an administrator assigns depend in part on where the application is running. For example, by default, an application that runs from the local computer has a higher level of trust and more privileges, such as accessing the file system, than an application that is running from the Internet.



Calling unmanaged code bypasses the .NET CLR security. An application that calls unmanaged code doesn’t necessarily have a security problem; it simply has an open door to the possibility of problems due to the functionality of the unmanaged code that has direct access to memory or machine registers, or uses pointers. Once the unmanaged code is being executed, the CLR can no longer check it.



Avoiding buffer overruns


One common type of attack attempts to make API methods operate out of specification, causing a buffer overrun. This attack typically passes unexpected parameters, such as an out-of-range index or offset value. Managed code avoids the buffer overruns that trigger so many security snafus.



Buffer overruns usually occur in programs written in languages such as C or C++, which do not check array bounds and type safety. If an application does not check the validity of the destination buffer size and other parameters, the copied data might overrun the buffer, overwriting the data in adjacent addresses.



Buffer overruns are theoretically impossible in managed code.
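A copy into a destination buffer that is too small shows the difference in practice. In the sketch below, Array.Copy is the real framework method and throws ArgumentException when the destination cannot hold the requested number of elements; the BufferCopySketch class and the TryCopy helper are hypothetical names used only for this illustration.

```csharp
using System;

public static class BufferCopySketch
{
    // Attempts to copy all of 'source' into a new destination buffer of
    // the given size and reports whether the runtime allowed the copy.
    public static bool TryCopy(byte[] source, int destinationSize)
    {
        byte[] destination = new byte[destinationSize];

        try
        {
            Array.Copy(source, destination, source.Length);
            return true;
        }
        catch (ArgumentException)
        {
            // The destination is too small; the copy is refused and
            // nothing is written past the end of the buffer.
            return false;
        }
    }
}
```

Copying 16 bytes into an 8-byte buffer returns false here, where the equivalent unchecked copy in C or C++ would silently overwrite adjacent memory.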



Summary


Using 100% managed code gives you solid performance, improved security, and fewer bugs. The CLR provides memory management and lifetime control of objects, including scalability features and versioning. When you call unmanaged code, you lose many of the valuable benefits of the .NET environment.



References


1. Gentile, Sam. “Intro to Managed C++, Part 2: Mixing Managed and Unmanaged Code.” The O’Reilly Network. http://www.ondotnet.com/lpt/a/3226 <08/20/2003>

2. Gray, Jan. “Writing Faster Managed Code: Know What Things Cost.” MSDN Library. http://msdn.microsoft.com/library/?url=/library/en-us/dndotnet/html/fastmanagedcode.asp <08/20/2003>

3. Gregory, Kate. “Managed, Unmanaged, Native: What Kind of Code Is This?” http://www.developer.com/net/cplus/print.php/2197621 <08/20/2003>

4. Mariani, Rico. “Garbage Collector Basics and Performance Hints.” MSDN Library. April 2003. http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndotnet/html/dotnetgcbasics.asp <08/20/2003>

5. McNaughton, Allan. “Boosting the Performance of Microsoft .NET.” MSDN Library. http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndotnet/html/optimaldotnet.asp <08/20/2003>

6. Microsoft Corporation. “Deployment Guide for the Microsoft Mobile Internet Toolkit.” http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnmitta/html/deploymobilwebapp.asp?frame=true <08/20/2003>

7. Microsoft. “The Windows Server 2003 Application Environment.” MSDN Library. http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnnetserv/html/windowsnetserver.asp <08/20/2003>

8. Microsoft. .NET Framework Developer’s Guide. Microsoft .NET Framework SDK 1.0. 2001.

9. Noriskin, Gregor. “Writing High-Performance Managed Applications: A Primer.” MSDN Library. http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndotnet/html/optimaldotnet.asp <08/04/2003>

10. Platt, David S. Introducing Microsoft .NET. Microsoft Press. Redmond, WA. 2001.

11. Richter, Jeffrey. “Microsoft .NET Framework Delivers the Platform for an Integrated, Service-Oriented Web.” MSDN Magazine. http://msdn.microsoft.com/msdnmag/issues/0900/Framework/default.aspx <08/20/2003>

12. Schanzer, Emmanuel. “Performance Tips and Tricks in .NET Applications.” MSDN Library.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/