In SQL I have an
EanTbl (EAN varchar(13) (PK), ProductID int) with millions of EAN barcode numbers as string. Many of which are in reality the same but written differently, i.e. with one or more leading zeros. One ProductID can be assigned to many EAN Barcodes numbers. (historically grown, I would have created the EAN column as Long from the beginning)
From now on, when a EAN barcode is inserted, I want to avoid the creation of additional doublettes by checking if an already existing EAN matches the new one.
EAN | ProductID ========================== 123456789| 987 \ 0123456789| 987 - In reality the same EAN 00123456789| 987 /
Now, when a user Input wants me to insert a new value of for instance
000123456789 I now want the most efficient query possible in T-SQL to discover that there are already matching EAN Barcodes assigned to this ProductID.
I already tried the different flavours of CONVERT() but as you can imagine the performance is horrible. This check needs to happens several hundreds times a minute on a table with 50+ millions of rows.
Thanks in advance for the suggestions.
Thanks to the suggestions. In the Long term I will reorganize the table as most of you suggested but for the moment I found the fastest way to do what I want is something like that:
SELECT EanCode, ProductID FROM EanCodes WHERE EanCode = @ean OR EanCode = '0'[email protected] OR EanCode = '00'[email protected] OR EanCode = '000'[email protected] OR EanCode = '0000'[email protected] OR EanCode = '00000'[email protected] OR EanCode = '000000'[email protected]
@ean being the string parameter of course
It seemed counter-intuitive at first but it is indeed the fastest way.
Answered By – KarmaEDV
Answer Checked By – Katrina (BugsFixing Volunteer)