[SOLVED] Sylvan CSV Reader C# Check for Missing Column in CSV

Issue

@MarkPflug I have a requirement to read 12 columns out of 45 – 85 total columns. This is from multiple csv files (in the hundreds). But here is the problem, a lot of the times a column or two will be missing from some csv data files. How do I check in C# for a missing column in a csv file given I use the nuget package sylvan csv reader. Here is some code:

// Create a reader
CsvDataReader reader = CsvDataReader.Create(file, new CsvDataReaderOptions { ResultSetMode = ResultSetMode.MultiResult });
// Get column by name from csv. This is where the error occurs only in the files that have missing columns. I store these and then use them in a GetString(Ordinal).
reader.GetOrdinal("HomeTeam");
reader.GetOrdinal("AwayTeam");
reader.GetOrdinal("Referee");
reader.GetOrdinal("FTHG");
reader.GetOrdinal("FTAG");
reader.GetOrdinal("Division");
// There is more data here, but anyway you get the point.
// Here I run the reader and for each piece of data I run my database write method.
    while (await reader.ReadAsync())
    {
        await AddEntry(idCounter.ToString(), idCounter.ToString(), attendance, referee, division, date, home_team, away_team, fthg, ftag, hthg, htag, ftr, htr);
    }

I tried the following:

// This still causes it to go out of bounds.
if(reader.GetOrdinal("Division") < reader.FieldCount)
    // only if the ordinal exists then assign it in a temp variable
else
    // skip this column (set the data in add entry method to "")

Solution

Looking at the source, it appears that GetOrdinal throws if the column name isn’t found or is ambiguous. As such I expect you could do:

int blah1Ord = -1;
try{ blah1Ord = reader.GetOrdinal("blah1"); } catch { }

int blah2Ord = -1;
try{ blah2Ord = reader.GetOrdinal("blah2"); } catch { }

while (await reader.ReadAsync())
{
    var x = new Whatever();
    if(blah1Ord > -1) x.Blah1 = reader.GetString(blah1Ord);
    if(blah2Ord > -1) x.Blah2 = reader.GetString(blah2Ord);
}

And so on, so you effectively sound out whether a column exists – the ordinal remains -1 if it doesn’t – and then use that to decide whether to read the column or not

Incidentally, I’ve been dealing with CSVs with poor/misspelled/partial header names, and I’ve found myself getting the column schema and searching it for partials, like:

using var cdr = CsvDataReader.Create(sr);

var cs = await cdr.GetColumnSchemaAsync();
var sc = StringComparison.OrdinalIgnoreCase;
var blah1Ord = cs.FirstOrDefault(c => c.ColumnName.Contains("blah1", sc))?.ColumnOrdinal ?? -1;

Answered By – Caius Jard

Answer Checked By – Robin (BugsFixing Admin)

Leave a Reply

Your email address will not be published. Required fields are marked *