[SOLVED] Data structure to store sorted items based on a cost field

Issue

I need to increase performance on an algorithm that for every iteration adds up to 4 items to a list and then takes out 1 item with the lowest cost (property on the item). The list is therefor increasing every iteration until the algorithm breaks the loop. Several items can have the same cost and I will just take one of them items in that case.

What data structure should I use?

I am thinking of creating a list where I intend to store the items sorted. Then for every iteration use some kind of sorting algorithm (will research the fastest) and insert the new items at the right position in the list. Then Remove the first item from the list.

Solution

.NET 6 update: a PriorityQueue<TElement, TPriority> class was introduced with .NET 6, which makes the implementation below mostly irrelevant.


Here is a basic PriorityQueue class that you could use. It is based on a SortedSet<(TSource, TPriority, long)>, with the long having the purpose of differentiating the items with equal priority.

public class PriorityQueue<TSource, TPriority>
{
    private readonly SortedSet<(TSource, TPriority, long)> _set;
    private long _index;

    public PriorityQueue(IComparer<TPriority> priorityComparer = null)
    {
        priorityComparer ??= Comparer<TPriority>.Default;
        var comparer = Comparer<(TSource, TPriority, long)>.Create((x, y) =>
        {
            int result = priorityComparer.Compare(x.Item2, y.Item2);
            if (result == 0) result = Comparer<long>.Default.Compare(x.Item3, y.Item3);
            return result;
        });
        _set = new SortedSet<(TSource, TPriority, long)>(comparer);
    }

    public void Enqueue(TSource item, TPriority priority)
    {
        _set.Add((item, priority, _index++));
    }

    public bool TryDequeue(out TSource item)
    {
        if (_set.Count == 0) { item = default; return false; }
        var min = _set.Min;
        _set.Remove(min);
        item = min.Item1;
        return true;
    }
}

The priority in your case is the cost of each item. Items with lower cost are dequeued first. Smaller TPriority values denote higher priority.

Usage example:

var queue = new PriorityQueue<Item, Decimal>();

//...

queue.Enqueue(item1, item1.Price);
queue.Enqueue(item2, item2.Price);

//...

if (queue.TryDequeue(out var bestPriceItem))
    Console.WriteLine(bestPriceItem);
else
    Console.WriteLine("The warehouse is empty.");

The implementation above should be quite efficient with large numbers of items. The SortedSet<T> is implemented internally as a binary search tree, with the basic operations having a decent O(log n) complexity. But it’s not the optimal solution. If you want the best of the best, you could take a look at what may become the official Microsoft’s implementation, which is now in the "api-approved" stage (source code). That one is based on a heap instead of a binary search tree, with the same O(log n) complexity, but smaller memory footprint and less constant overhead per operation.

Answered By – Theodor Zoulias

Answer Checked By – Candace Johnson (BugsFixing Volunteer)

Leave a Reply

Your email address will not be published. Required fields are marked *