I find myself using Pandas with almost ever python data-esque project. Additionally, the databases I most often interact with are relational, specifically SQL. These basic instructions will allow you to query a SQL database and get the results as a Pandas data frame.

$ sudo pip3 install pandas sqlalchemy mysqlclient

I’ve also used flask_sqlalchemy when writing an API.

$ sudo pip install sqlalchemy

For ubuntu, install the g++, sqlclient and make sure you get python3.7 and python3.7-dev

$ sudo apt-get install g++ default-libmysqlclient-dev python3.7-dev

Which you would do in addition to above. Your import could look like this:

from sqlalchemy import create_engine, Table, MetaData, insert, Column, Integer, Enum

If you’re using flask you’d want to use

from flask_sqlalchemy import SQLAlchemy

You can read more here

Now you are able to write a SQL statement against a SQL server and get the results as a Pandas DataFrame:

engine = create_engine(f"mysql://{user}:{password'}@{host}/{database}", echo=False)

Where you’ll replace:

  • user
  • password
  • host
  • database

With your own information. And now we can use our engine to run a SQL query!

Here’s any example of using a SQL query to fetch data from our SQL DB and store the result in a DataFrame:

read_query = "SELECT * FROM foo

read_df = pd.read_sql(read_query, con=engine)

Now read_df has the result of the read_query!

Here’s any example of using a SQL query to write data to our SQL DB from a DataFrame:

data = {'row_1': [3, 2, 1, 0], 'row_2': ['a', 'b', 'c', 'd']}

write_df = pd.DataFrame.from_dict(data)

write_df.to_sql(name="sql-table-name", con=engine, schema="schema-name")

As always, check the official docs for more information, but this is the bare bones needed to read and write to a SQL DB using Pandas!