C# Technique: Extracting Table and Field Names from SQL Query

Hey everyone, I’m working on a C# project where I need to parse a SQL query string. I’m trying to figure out how to get the table names and field names from the query. Here’s what I’m aiming for:

string sqlQuery = @"SELECT Products.Name, Products.Price, Orders.Date 
                     FROM Products 
                     JOIN Orders ON Products.ID = Orders.ProductID";

// What I want:
List<string> fields = new List<string>
{
    "Products.Name", "Products.Price", "Orders.Date"
};

List<string> tables = new List<string>
{
    "Products", "Orders"
};

I’ve thought about using string manipulation, but it feels clunky. Is there a better way to do this in C#? Maybe a library or technique I’m not aware of?

I’d really appreciate any suggestions or code examples. Thanks!

While regex and string manipulation can work for simple queries, they often fall short for complex SQL statements. Have you considered using a SQL parsing library like Microsoft.SqlServer.TransactSql.ScriptDom? It’s part of the SQL Server Management Objects (SMO) framework and provides robust parsing capabilities.

Here’s a basic example of how you might use it:

using Microsoft.SqlServer.TransactSql.ScriptDom;

var parser = new TSql150Parser(false);
IList<ParseError> errors;
var fragment = parser.Parse(new StringReader(sqlQuery), out errors);

var visitor = new CustomSqlVisitor();
fragment.Accept(visitor);

// visitor.Tables and visitor.Fields would contain your extracted data

You’d need to implement a custom visitor class to traverse the parsed SQL and extract the information you need. This approach is more reliable and can handle a wide range of SQL syntax.

hey alex, have you considered using a sql parser library? i’ve heard good things about ANTLR for this kinda stuff. it might be overkill for simple queries, but could be super helpful if youre dealing with more complex ones. what types of queries are you typically working with?

have u tried using regex? it could work for simpler queries. smthing like this might help:

var fieldRegex = new Regex(@"SELECT\s+(.*?)\s+FROM");
var tableRegex = new Regex(@"FROM\s+(.*?)(?:\s+WHERE|$)");

just be careful with complex queries, regex might not catch everything