[SOLVED] Is there any way to make Search and addToSearch faster?

Issue

Is there any way to make Search and addToSearch faster?
I am trying to make it faster. I am not sure if regex in addtosearch can be a problem, it is really small. I am out ofideas how to optimize it further. Now i am just trying to meet word count. I wonder if there is a way to concatenate parts of name that are not empty more effectivly than i do.

using System.Collections.Generic;
using System.Text.RegularExpressions;
using System;

namespace AutoComplete
{
    public struct FullName
    {
        public string Name;
        public string Surname;
        public string Patronymic;
    }

    public class AutoCompleter
    {
        private List<string> listOfNames = new List<string>();
        private static readonly Regex sWhitespace = new Regex(@"\s+");
        public void AddToSearch(List<FullName> fullNames)
        {
            foreach (FullName i in fullNames)
            {
                string nameToAdd = "";
                if (!string.IsNullOrWhiteSpace(i.Surname))
                {
                    nameToAdd += sWhitespace.Replace(i.Surname, "") + " ";
                }
                if (!string.IsNullOrWhiteSpace(i.Name))
                {
                    nameToAdd += sWhitespace.Replace(i.Name, "") + " ";
                }
                if (!string.IsNullOrWhiteSpace(i.Patronymic))
                {
                    nameToAdd += sWhitespace.Replace(i.Patronymic, "") + " ";
                }
                listOfNames.Add(nameToAdd.Substring(0, nameToAdd.Length - 1));
            }
        }

        public List<string> Search(string prefix)
        {
            if (prefix.Length > 100 || string.IsNullOrWhiteSpace(prefix))
            {
                throw new System.Exception();
            }
            List<string> namesWithPrefix = new List<string>();
            foreach (string name in listOfNames)
            {
                if (IsPrefix(prefix, name))
                {
                    namesWithPrefix.Add(name);
                }
            }
            return namesWithPrefix;
        }
        private bool IsPrefix(string prefix, string stringToSearch)
        {
            if (stringToSearch.Length < prefix.Length)
            {
                return false;
            }
            for (int i = 0; i < prefix.Length; i++)
            {
                if (prefix[i] != stringToSearch[i])
                {
                    return false
                }
            }
            return true
        }
    }
}

Solution

Regular expression (Regexp) are great because of their ease-of use and flexibility but most Regexp engines are actually quite slow. This is the case for the one of C#. Moreover, strings can contain Unicode character and "\s" needs to consider all the (fancy) spaces characters included in the Unicode character set. This make Regexp search/replace much slower. If you know your input does not contain such characters (eg. ASCII), then you can write a much faster implementation. Alternatively, you can play with RegexpOptions like Compiled and CultureInvariant so to reduce a bit the run time.

The AddToSearch performs many hidden allocations. Indeed, += create a new string (because C# string are immutable and not designed to be often resized) and Replace calls does allocate new strings too. You can speed up the computation by directly replace and write the result in a preallocated buffer and simply copy the result with a Substring like you currently do.

Search is fine and it is not easy to optimize it. That being said, if listOfNames is big, then you can use multiple threads so to significantly speed up the computation. Be careful though because Add is not thread-safe. Parallel linkq may help you to do that easily (I never tested it though).

Another solution to speed up a bit the computation of Search is to start the loop of IsPrefix from prefix.Length-1. Indeed, if most string contains the beginning of the prefix, then a significant portion of the time will be spend comparing nearly equal characters. The probability that prefix[prefix.Length-1] != stringToSearch[prefix.Length-1] is higher than prefix[0] != stringToSearch[0]. Additionally, partial loop unrolling may help a bit to speed up the function if the JIT is not able to do that.

Answered By – Jérôme Richard

Answer Checked By – Jay B. (BugsFixing Admin)

Leave a Reply

Your email address will not be published. Required fields are marked *