[SOLVED] Why does Connection.exec_driver_sql have inconsistent parameter parsing for queries with the IN operator?

Issue

These two queries are semantically identical, but one of them succeeds and the other one fails. The only difference is in the WHERE clause, where the two operands of the OR operator have been switched.

from sqlalchemy import create_engine
engine = create_engine('mysql+pymysql://user:[email protected]:port/db', 

# Query succeeds (and does what's expected)
with engine.connect() as cn:
    cn.exec_driver_sql(
        f'UPDATE table SET column = "value" WHERE id = %s OR id IN %s', 
        (3, (1, 2))
    )

# Query fails
with engine.connect() as cn:
    cn.exec_driver_sql(
        f'UPDATE table SET column = "value" WHERE id IN %s OR id = %s', 
        ((1, 2), 3)
    )

Output of the failed query:

TypeError: ‘int’ object is not iterable

It seems that sqlalchemy argument parsing is dependant on the type of the first argument. If the first one if a tuple, the query fails, if it is a int/float/str, it succeeds.

The workaround I’ve found so far is to use named arguments:

# Query succeeds
with engine.connect() as cn:
    cn.exec_driver_sql(
        f'UPDATE table SET column = "value" WHERE id IN %(arg1)s OR id = %(arg2)s', 
        {'arg1': (1, 2), 'arg2': 3}
    )

However it is more verbose and I don’t want to use this everywhere. Also note that PyMySQL cursor’s execute method accepted both queries.

Is there a reason for this behaviour?

Solution

I think the problem lies in DefaultExecutionContext._init_statement classmethod.

...
        if not parameters:
            ...
        elif isinstance(parameters[0], dialect.execute_sequence_format):
            self.parameters = parameters
        elif isinstance(parameters[0], dict):
            ...
        else:
            self.parameters = [
                dialect.execute_sequence_format(p) for p in parameters
            ]

        self.executemany = len(parameters) > 1

isinstance(parameters[0], dialect.execute_sequence_format) is checking whether the first element of parameters is a tuple. This seems to be heuristic for efficiently detecting an executemany scenario: probably it should check that all the elements are tuples and of equal length*. As it is values ((1, 2), 3) will cause the equivalent of

cursor.executemany(sql, [(1, 2), 3])

and syntactically invalid statements like

SELECT * FROM tbl WHERE id IN 1 OR id = 2
--                         ^^^^

Wrapping the parameters in a list fixes the problem, since len(parameters) will no longer be greater than one.

with engine.connect() as cn:
    cn.exec_driver_sql(
        f'UPDATE table SET column = "value" WHERE id IN %s OR id = %s', 
        [((1, 2), 3)]
    )

Obviously this is a workaround on top of a heuristic, so it may not work in every possible situation. It’s probably worth opening a discussion on GitHub to explore whether this is a bug that should be fixed.


* Passing a tuple to create parenthesised values for ...IN %s or ...VALUES %s is not supported by all drivers, so it’s not that bad a heuristic.

Answered By – snakecharmerb

Answer Checked By – Clifford M. (BugsFixing Volunteer)

Leave a Reply

Your email address will not be published. Required fields are marked *