[SOLVED] How to speed up python and sqlalchemy?

Issue

The model in my source code is in the format below.
Array in dict Array in dict Array in dict…

# data structure
user_list = [{user_name: 'A', 
  email: '[email protected]', 
  items:[{name:'a_itme1', properties:[{1....},{2....}...]}
 ]} * 100]

I’m trying to put the above data into a postgresql db with SQLAlchemy.
There is a user table, an entity table, and an attribute table.
And there are tables that link users and items, and items and properties respectively.

for u in user_list:
  new_user = User(user_name=u.get('user_name'),....)
  session.add(new_user)
  session.flush()
  for item in u.get('items'):
    new_item = Item(name=item.get('name'),.....)
    session.add(new_item)
    session.flush()
    new_item_link = UserItemLink(user_id=new_user.id, item_id=new_item.id,...)
    session.add(new_item_link)
    session.flush()
    for prop in item.properties:
      new_properties = Properties(name=prop.get('name'),...)
      session.add(new_properties)
      session.flush()
      new_prop_link = ItemPropLink(item_id=new_item.id, prop_id=new_properties.id,...)
      session.add(new_prop_link)
      session.flush()
session.commit()

My models look like this:

class User(Base):
    __tablename__ = 'user'

    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(20))
    email = Column(String(50))

    user_item_link = relationship('UserItemLink', back_populates='user')

class Item(Base):
    __tablename__ = 'item'

    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(50))
    note = Column(String(50))

    user_item_link = relationship('UserItemLink', back_populates='item')

class Properties(Base):
    __tablename__ = 'properties'

    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(50))
    value = Column(String(50))

    item_prop_link = relationship('ItemPropLink', back_populates='properties')

class UserItemLink(Base):
    __tablename__ = 'user_item_link'

    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    user_id = Column(ForeignKey('db.user.id'), nullable=False)
    item_id = Column(ForeignKey('db.item.id'), nullable=False)

The above sources have been simplified for better understanding.
When session.add() is performed sequentially with the above information, it takes a lot of time.
When 100 user information is input, there is a delay of 8 seconds or more.

Please advise to improve python speed and sqlalchemy speed.

Solution

As you have relationships configured on the models you can compose complex objects using these relationships instead of relying on ids:

with Session.begin() as s:
    for u in user_list:
        user_item_links = []
        for item in u.get('items'):
            item_prop_links = []
            for prop in item['properties']:
                item_prop_link = ItemPropLink()
                item_prop_link.properties = Properties(name=prop.get('name'), value=prop.get('value'))
                item_prop_links.append(item_prop_link)
            item = Item(name=item.get('name'), item_prop_link=item_prop_links)
            user_item_link = UserItemLink()
            user_item_link.item = item
            user_item_links.append(user_item_link)
        new_user = User(name=u.get('user_name'), email=u.get('email'), user_item_link=user_item_links)
        s.add(new_user)

SQLAlchemy will automatically set the foreign keys when the session is flushed at commit time, removing the need to manually flush.

Answered By – snakecharmerb

Answer Checked By – Candace Johnson (BugsFixing Volunteer)

Leave a Reply

Your email address will not be published. Required fields are marked *