Sorting .Net Collections
(by John Roberts for Blayd Software)
Note: this article relates to the .Net Framework version 2.0, some
of the classes and class members referenced in this article were not available in
earlier framework versions.
Introduction
Collections with sorting capabilities are required in many situations, for example,
we may want to provide advanced or more efficient searching capabilities, such as
binary searching, which require the collection to be sorted, we may want to provide
greater flexibility to users of the collection or we may want to provide application
developers with the functionality to data bind the collection to a user interface
control that supports sorting. For collections or lists that are only used internally,
within a component, the .Net Framework provides a few, ready to use, classes that
provide sorting functionality. The basic Array class has been in the framework from
version 1.0 and provides sorting and a binary search method and from version 2.0
of the .Net Framework has provided support for generics.
The SortedDictionary<TKey, TValue> and SortedList<TKey, TValue>
classes, from the System.Collections.Generic namespace both store their data is sorted
order, based on their keys but neither class provides (explicit) sort or binary search methods. The
System.Collections.Generic.List<T> class provides (explicit) sorting and a binary search
method. The List<T> class uses the Array class, internally, to provide its sorting functionality
and is fine for use internally within a component, however, it relies on the caller
to supply a comparer, rather than specifying the required sort property, when sorting
multi-property business objects and therefore it may not be ideal for public exposure
from within a component.
Extending the List<t> class to provide a collection with sorting capability from a component is a
possibility, however, if you don't like the idea of exposing the existing List<T>
sort methods to external callers and we don't, then a different solution is needed. If you are
wondering why we don't like the idea of exposing the List<T> sort methods, it
is because we prefer to expose the sort methods in a more caller or user friendly way and there is also
the matter of controlling the sorting process. If we let the caller specify the comparer then we have
no way of knowing whether the sort criteria are valid and if they are not then any exception that occurs
in the comparer may pass through our sort method an will almost certainly leave our collection in an
unknown state. Fortunately there is a fairly simple way around this potential problem, derive from
System.Collections.ObjectModel.Collection<T> rather than List<T>.
The Collection<T> base class does not have any existing sort methods and
we can, therefore, implement our sort methods in any way we choose and as an added bonus we can, as you
will see later in the article, still use the List<T> class internally to sort the collection,
which, interestingly, means that we will ultimately be using the Array class to sort our collection!
In this article we going to discuss the development of a collection class with sorting
and binary search capabilities. The collection is primarily intended to be used
from code, however, as you will see later in the article, the collection can be
used by application developers to data bind to a user interface control that has
sorting capabilities e.g. the DataGridView control. If the primary objective was
to provide a collection with sorting and searching functionality for data binding
then developing a collection with an IBindingList implementation would be the preferred
solution. However, for the purpose of this article we are assuming that a full IBindingList
implementation would be overkill and our intention is to produce a collection that
may be used for data binding but that is not its primary purpose. If you are interested
in and want to know more about IBindingList then the Blayd Software
DataSource and BindingList article details the development of a BindingList<T>
derived collection class and a data source class that is used to supply the collection
to binding clients.
We will show as much of the source code as we have space for, however, we will not
able to show it all. You can download the full source code for this article using the link provided in
the "Article Options" on the left of the page. The source code shown in this
article will be in C# but the download sample code is available as either a C# or a
VB Visual Studio 2005 project.
Collected Class
In order to develop the collection class we need something to put in it i.e. the
collected class and for the purpose of this article we need a business type object
that implements several properties on which the collection can be sorted and to
make it more interesting and relevant the properties need to be of differing types
i.e. String, Integer, DateTime, etc. The
Identity class is to be our collected class,
instances of this class will be stored in the collection and the collection will
provide the functionality to allow callers to sort the collection on any of the
Identity class property values.
Partial listing of the Identity class:
public class Identity : IComparable<Identity>, IEquatable<Identity>
{
private long _id;
private string _title;
private string _firstName;
private string[] _middleNames;
private string _lastName;
private Gender _gender;
private DateTime _dateOfBirth;
public Identity() : base()
{
// Assign defaults to internal fields
// ...
}
protected internal Identity(long id, string title, string firstName,
string middleNames, string lastName,
Gender gender, DateTime dateOfBirth) : this()
{
// Validate and assign to internal fields
// ...
}
public int Age
{
get { return CalculateAge(); }
}
public DateTime DateOfBirth
{
get { return _dateOfBirth; }
set
{
// Validate and assign to internal field
// ...
}
}
public string FirstName
{
get { return _firstName; }
set { _firstName = (value == null) ? string.Empty : value; }
}
// Other properties
// ...
private string GetMiddleNames()
{
// Build middle names array into comma delimited string
// ...
}
public int CalculateAge()
{
// Calculate age from date of birth
// ...
}
int IComparable<Identity>.CompareTo(Identity other)
{
if (other == null)
{
// This object is greater...
return 1;
}
else
{
return _id.CompareTo(other.Id);
}
}
bool IEquatable<Identity>.Equals(Identity other)
{
if (other == null)
{
return false;
}
else
{
return (_id == other.Id);
}
}
public void UpdateMiddleNames(string middleNames)
{
// Split out middle names and load into middle names array
// ...
}
public void UpdateMiddleNames(string[] middleNames)
{
// Assign middle names to internal field
// ...
}
}
There is nothing particularly remarkable about the Identity class, it simply provides
the names, gender, date of birth, etc. of a fictitious (for the purpose of this
article) person. However, it does provide us with several property types to sort
on, Long, Integer, String, DateTime and the
Gender enumeration.
Comparer Class
As you will see later in this article we are going to utilize the List<T> class to perform the
sorting and searching. To make use of the List<T> sorting and searching functionality we will
need to supply a comparer to both the sort and the binary search methods. There are, typically,
several available options when specifying a comparer, the Comparison<T> delegate or an
IComparer<T> implementation either in a class derived from the Comparer<T> abstract base
class or in your own class. If you are searching or sorting on a single type of value, for example,
a String or an Integer then the StringComparer or Comparer<T>.Default (for the relevant type
and providing the type implements the IComparable<T> interface) would be available.
We want to provide the functionality to sort on any of the Identity class properties
and we also want to make use of the List<T> sort and binary search methods internally,
therefore we need to develop an implementation of the generic IComparer<T>
interface. Because we are sorting the list based on the value of a caller specified
Identity class property, our comparer needs to know what the sort property is and
it also needs to have the ability to retrieve the value of the specified property
from the two Identity instances passed to the Compare method. This seems like a
fairly straight forward requirement, however, it can be tricky to implement this
kind of functionality without creating either an explicit or an implicit dependency
between the comparer and the Identity class.
We are going to use the System.ComponentModel.PropertyDescriptor class to allow
callers to specify the required sort property because this technique is compatible
with the approach specified in the IBindingList interface and a PropertyDescriptor
instance is fairly easy for the caller to create and provides all of the information
the collection needs to sort its data. This is all starting to look pretty good,
we can pass the PropertyDescriptor to our comparer, in the constructor, to specify
which Identity property to use in the comparison. It gets even better when we examine
the members available in the PropertyDescriptor class, it has a PropertyType property
which returns the Type of the property it is representing and a GetValue method
that returns the value of the property for a specified instance of the Identity
class. However, there is one major drawback to using this approach, the GetValue
method is, typically, implemented using reflection and reflection calls are relatively
slow. The comparer can resolve the Type of the required sort property when it is
initialized as this value remains constant throughout the sort but it will have
to make a GetValue call on each Identity instance passed to its Compare method and
it will have to make these two calls on every call to its Compare method, therefore,
sorting a large list with a fairly random distribution of objects, based on the
sort property, is going to be relatively slow.
An alternative approach to the PropertyDescriptor, reflection, technique is to provide
our comparer class with knowledge of the Identity class. If the comparer knows about
the properties of the Identity class it can access the properties directly. However,
there is also a drawback to this approach, it introduces a dependency between the
comparer and the Identity class that may not be apparent to a developer who, at
sometime in the future, extends the Identity class by adding one or more new properties.
If the comparer class is not updated at the same time, it will not know about the
new properties and will not, therefore, be able to sort on any of the new properties.
Although this is, undoubtedly, a pain, it should be immediately obvious to the developer
making the changes to the Identity class as, surely, they will regression test the
sorting process.
We now have a choice to make, do we go for the PropertyDescriptor, reflection, technique
which requires very little code, is fairly intuitive and providing the Type of each
Identity property implements the IComparable<T>
interface as a fallback, does not introduce a dependency between the compare and
the Identity classes. Well no, we don't, in the tests that we have made, this technique
performs poorly in relation to the performance achieved by directly accessing the
Identity class properties. Performance degrades as the size of the collection grows
but the difference can be noticeable in relatively small collections e.g. around
50 objects, therefore, if your collection will never contain a lot of objects or
if the distribution of the objects is fairly regular you might take a different
view. However, even though there is more code required to resolve the PropertyDescriptor
to the relevant Identity class property, the code is not very pretty and remembering
to update the comparer when updating the Identity class is not ideal, we still prefer
the option that provides the best performance.
Partial listing of the IdentityComparer class:
internal class IdentityComparer : IComparer<Identity>
{
// Supported sort properties.
private static string[] _sortProperties;
private string _propertyToMatch;
// Index of the sort property in the array.
private int _sortPropertyIndex;
static IdentityComparer()
{
_sortProperties = new string[] { "Age",
"DateOfBirth",
"FirstName",
"Gender",
"Id",
"LastName",
"MiddleNames",
"Title" };
}
public IdentityComparer() : base()
{
_sortPropertyIndex = DEFAULT_SORTPROP_INDEX;
}
public IdentityComparer(PropertyDescriptor sortProperty) : this()
{
if (sortProperty != null)
{
_sortPropertyIndex = GetSortPropertyIndex(sortProperty);
}
}
private int CompareCore(Identity x, Identity y)
{
int comparison = 0;
switch (_sortPropertyIndex)
{
case 0:
{
comparison = CompareInt(x.Age, y.Age);
break;
}
case 1:
{
comparison = CompareDateTime(x.DateOfBirth, y.DateOfBirth);
break;
}
// Other sort property case statements
// ...
}
return comparison;
}
private int CompareDateTime(DateTime x, DateTime y)
{
return x.CompareTo(y);
}
private int CompareString(string x, string y)
{
return string.Compare(x, y, StringComparison.CurrentCultureIgnoreCase);
}
// Other type compare methods
// ...
private bool FindSortPropertyMatch(string propertyName)
{
// Compares the specified sort property array value
// with the PropertyDescriptor.Name
// ...
}
private int GetSortPropertyIndex(PropertyDescriptor sortProperty)
{
_propertyToMatch = sortProperty.Name;
int index = Array.FindIndex(_sortProperties, FindSortPropertyMatch);
if (index == -1)
{
// Sort property not found, so use the default...
index = DEFAULT_SORTPROP_INDEX;
}
return index;
}
int IComparer<Identity>.Compare(Identity x, Identity y)
{
int comparison = 0;
if (x == null)
{
if (y == null)
{
// Equal...
comparison = 0;
}
else
{
// y > x...
comparison = -1;
}
}
else
{
if (y == null)
{
// x > y...
comparison = 1;
}
else
{
comparison = CompareCore(x, y);
}
}
return comparison;
}
}
It is immediately evident from the above listing that the IdentityComparer class
is tied to a particular version of the Identity class, the _sortProperties array
field is populated with the names of the Identity class properties. The sort array
index of the Name of the PropertyDescriptor specified in the constructor is retrieved
and is then resolved in the CompareCore method to the relevant Identity class property
to use in the comparison. There are other, more maintenance friendly, ways of retrieving
the Identity class property names, we could, for example, use reflection or we could
just use the Identity class PropertyDescriptorCollection directly, however, ultimately
we would still have to resolve the property name to the actual property in the CompareCore
method, so the comparer would still be tied to a particular Identity class version.
The technique of explicitly naming the Identity properties in the static constructor
does, at least, make the dependency obvious.
Collection Class That Supports Sorting
To provide a collection with sorting and binary searching capabilities we have chosen
to extend the generic Collection<T> base class. The Collection<T> class is located in the
System.Collections.ObjectModel namespace and can be used as is i.e. it is a concrete base class or can be
extended to provide extra functionality. We can use the constructors of our derived collection class to
ensure that the Collection<T> base class uses an instance of the List<T> class as its internal
storage, the Collection<T> base class will wrap the List<T> instance that we specify and will
make it available to our derived collection class in the protected Items property. Using this technique
enables our derived collection class to use the sorting and searching methods provided by the List<T>
class rather than explicitly implementing its own sorting and searching functionality.
Partial listing of the IdentityCollection class:
public class IdentityCollection : Collection<Identity>
{
private PropertyDescriptor _sortProperty;
private ListSortDirection _sortDirection;
private bool _isSorted;
public IdentityCollection() : this(new List<Identity>())
{
}
public IdentityCollection(List<Identity> list) : base(list)
{
_sortDirection = ListSortDirection.Ascending;
}
public bool IsSorted
{
get { return _isSorted; }
}
public ListSortDirection SortDirection
{
get { return _sortDirection; }
}
public PropertyDescriptor SortProperty
{
get { return _sortProperty; }
}
private PropertyDescriptor GetDefaultSortProperty()
{
// Returns the PropertyDescriptor for the default
// sort property
// ...
}
private void ResolveId(Identity item)
{
// Ensures that Identity has a unique Id
// ...
}
protected override void ClearItems()
{
// Clears the collection
// ...
}
protected override void InsertItem(int index, Identity item)
{
// Inserts an Identity into the collection
// ...
}
protected override void RemoveItem(int index)
{
// Removes Identity from the collection
// ...
}
protected override void SetItem(int index, Identity item)
{
// Replaces one Identity with another
// ...
}
public int BinarySearch(Identity item)
{
return BinarySearch(0, this.Items.Count, item);
}
public int BinarySearch(int index, int count, Identity item)
{
if (item == null)
{
throw new ArgumentNullException();
}
if ((!_isSorted) ||
(_sortDirection != ListSortDirection.Ascending))
{
throw new InvalidOperationException(
"Collection is not sorted in ascending order.");
}
List<Identity> items = this.Items as List<Identity>;
return items.BinarySearch(index, count, item,
new IdentityComparer(_sortProperty));
}
public void Sort()
{
Sort(GetDefaultSortProperty(), ListSortDirection.Ascending);
}
public void Sort(ListSortDirection direction)
{
Sort(GetDefaultSortProperty(), direction);
}
public void Sort(PropertyDescriptor prop, ListSortDirection direction)
{
if ((prop != null) &&
(typeof(Identity).IsAssignableFrom(prop.ComponentType)))
{
if (!Enum.IsDefined(typeof(ListSortDirection), direction))
{
// Default to ascending...
direction = ListSortDirection.Ascending;
}
List<Identity> items = this.Items as List<Identity>;
if ((items != null) && (items.Count > 0))
{
items.Sort(new IdentityComparer(prop));
_sortProperty = prop;
_sortDirection = ListSortDirection.Ascending;
if (direction != _sortDirection)
{
// Reverse the sort direction...
items.Reverse();
_sortDirection = direction;
}
_isSorted = true;
}
}
}
}
The IdentityCollection class provides its own BinarySearch and Sort overloaded methods,
however, ultimately both methods delegate to the List<T> instance that is used internally
as storage. The IdentityCollection class constructors ensure that the Collection<T> base class
uses an instance of the List<T> class for internal storage. Both the BinarySearch and Sort methods
pass an instance of the IdentityComparer class to the List<T>
instance to enable the search and sort to be performed on the Identity property
specified in the PropertyDescriptor. The BinarySearch method uses the current sort
property by passing the _sortProperty field to the IdentityComparer constructor
because the binary search will only work as expected if the internal list is already
sorted on the property that is being searched. The Sort method uses either the default
sort property, by calling the GetDefaultSortProperty method or the property in the
specified PropertyDescriptor, depending on which overload is called.
Data Binding the Collection Class
As stated earlier in the article, if we were creating a collection class that is
intended primarily to be used by user interface components or controls, the recommend
option would be to develop a collection that provides an implementation of the IBindingList
interface. The easiest way of providing an IBindingList implementation is by deriving
your own collection class from the generic BindingList<T> class which is contained in the
System.ComponentModel namespace. However, this option is overkill in a lot of situations and it
is certainly easier just to add the functionality you require to a class derived from Collection<T>
than it is to implement all of the requirements of the IBindingList interface even though the
BindingList<T> class does ease the burden. The IdentityCollection class that we have developed
here may be intended primarily to be used from code but it can still be used by
application developers and it can still be data bound to a user interface component
or control. Application develops can also make use of the sorting functionality
provided by the IdentityCollection class, they will just have to sort the collection
from code, rather than have the user interface component or control sort the collection
automatically.
To data bind the IdentityCollection to the DataGridView control:
- Add a
DataGridView control to a Form and position, size and anchor it as required.
- Add a
BindingSource component to the Form.
- Set the
BindingSource.DataSource property to the IdentityCollection class. You will
have to add the collection as a project data source as part of the setting process.
- Set the
DataGridView.DataSource property to the BindingSource component.
- In the
DataGridView properties window, click the ellipsis button in the Columns
property setting to display the edit column collection dialog. In the dialog, arrange
the Identity class property columns in the required order and set the SortMode of
each column to Programmatic.
- In the
DataGridView properties window, set the MultiSelect property to False.
The rest of the data binding and sorting has to be achieved in code within the Form.
Create a Form level field of type IdentityCollection and a field of type
DataGridViewColumn. Create a helper method to populate the IdentityCollection,
in this example the method is named LoadCollection. Define and connect a handler to the
Form.Load event.
In the Form and Form.Load event handler, place the following code:
private IdentityCollection _identities;
private DataGridViewColumn _sortColumn;
private void Form1_Load(object sender, EventArgs e)
{
LoadCollection();
bindingSource1.DataSource = _identities;
}
At this point if you run the application you will see the contents of the IdentityCollection
displayed in the DataGridView control, however, you won't be able to sort the collection.
A bit more code is required to perform sorting. Define and connect a handler to
the DataGridView.ColumnHeaderMouseClick event.
In the DataGridView.ColumnHeaderMouseClick event handler,
place the following code:
private void dataGridView1_ColumnHeaderMouseClick(object sender,
DataGridViewCellMouseEventArgs e)
{
if (e.Button == MouseButtons.Left)
{
DataGridViewColumn newColumn = dataGridView1.Columns[e.ColumnIndex];
if (_sortColumn != null)
{
// Clear the sort image from the previous sort column header...
_sortColumn.HeaderCell.SortGlyphDirection = SortOrder.None;
}
ListSortDirection direction = _identities.SortDirection;
if (_sortColumn == newColumn)
{
// Toggle the sort direction...
if (direction == ListSortDirection.Ascending)
{
direction = ListSortDirection.Descending;
}
else
{
direction = ListSortDirection.Ascending;
}
}
else
{
direction = ListSortDirection.Ascending;
}
PropertyDescriptorCollection properties =
TypeDescriptor.GetProperties(typeof(Identity));
PropertyDescriptor prop =
properties.Find(newColumn.DataPropertyName, false);
_identities.Sort(prop, direction);
bindingSource1.ResetBindings(false);
if (direction == ListSortDirection.Ascending)
{
newColumn.HeaderCell.SortGlyphDirection = SortOrder.Ascending;
}
else
{
newColumn.HeaderCell.SortGlyphDirection = SortOrder.Descending;
}
_sortColumn = newColumn;
}
}
If you run the application now you will see the contents of the IdentityCollection
in the DataGridView control and you will be able to sort the grid by left clicking
any of the grid header cells. Clicking the current sort header cell again will reverse
the sort order. In addition to the sorting, demonstrated above, if you have set
the DataGridView properties appropriately, you will be able to edit an Identity
and add an Identity to or remove an Identity from the IdentityCollection.
The updating of the DataGridView works because the BindingSource component implements
its own ListChanged event, so even though our collection does not implement IBindingList,
which defines the ListChanged event, the BindingSource component is smart enough
to raise its own event when we call the BindingSource.ResetBindings method. Incidentally,
you might be tempted to call the BindingSource.ApplySort method, rather than the
IdentityCollection.Sort method, however the call will fail because the BindingSource
component only supports the ApplySort method for collections that implement the
IBindingList interface.
Conclusion
Adding a sort and binary search capability to a collection class is beneficial in
many situations and it is certainly fairly straight forward to take advantage of
the sorting and searching functionality already present in some of the .Net Framework
classes. If you need smarter or perhaps faster sorting then you can always use your
own algorithm, however, in many situations the QuickSort algorithm provided in the
Array class is adequate and you do have the opportunity to fine tune your comparer
to get the desired results. Whether you take the minimalist approach described in
this article or go with a full-blown IBindingList implementation depends on what
you are trying to achieve with the collection class and to a certain extent how
much time you have!
Copyright ©Blayd Software Limited 2008-2010. All rights reserved.