Copy data from MySQL to Postgres using SQLAlchemy

As I mentioned earlier, I somehow had to move my data from MySQL to Postgres. There is a wiki full of documentation on how to do that, but before falling back to a ready-made solution I wanted to get my hands dirty with SQLAlchemy.

At first I wanted to dump (serialize/deserialize) the SQLAlchemy objects using pickle. Basically something like this:

import pickle

# Serialize the list of User objects to a file
users = Session.query(User).all()
with open('users.pkl', 'wb') as output:
    pickle.dump(users, output)
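For completeness, the round trip looks like this. A sketch with plain tuples standing in for the ORM objects; real instances pickle the same way, but keep in mind that pickle stores only a reference to the class, so the User class must be importable again when loading:

```python
import pickle

# Stand-in for the list returned by Session.query(User).all()
users = [('alice', 1), ('bob', 2)]

# Dump the objects ...
with open('users.pkl', 'wb') as out:
    pickle.dump(users, out)

# ... and read them back
with open('users.pkl', 'rb') as src:
    restored = pickle.load(src)

print(restored == users)  # True: the objects survive the round trip
```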

But after consulting the SQLAlchemy documentation I figured there was a better way to accomplish my goal.

from sqlalchemy import create_engine

src = create_engine('mysql+oursql://username:password@127.0.0.1:3306/dbname')
dst = create_engine('postgresql://username:password@localhost:5432/dbname')

tables = Base.metadata.tables
for tbl in tables:
    print('##################################')
    print(tbl)
    print(tables[tbl].select())
    data = src.execute(tables[tbl].select()).fetchall()
    for row in data:
        print(row)
    if data:
        print(tables[tbl].insert())
        dst.execute(tables[tbl].insert(), data)

First I create two engines: one pointing to the source database (MySQL), the other pointing to the destination (Postgres). Both databases are by default only accessible from localhost, but thanks to OpenSSH's port forwarding that wasn't an issue.

ssh -L 5432:localhost:5432 someuser@hostA
ssh -L 3306:127.0.0.1:3306 someuser@hostB

Important: MySQL will try to connect over a Unix socket instead of TCP/IP if you use localhost. So it would silently bypass your shiny port forwarding if you connect to localhost instead of 127.0.0.1.

Base.metadata is my SQLAlchemy MetaData object. Using this metadata I iterate over all the table names, generate and execute a select statement for each table, store the results in the data variable, and finally execute the insert statement with those results.

Tuesday, November 1, 2011 » python sqlalchemy mysql postgres