Data Science From Scratch To Production MVP Style: API

Serializing

If you’re not familiar with serialization and deserialization: “serialization” means taking an object in memory and converting it into a form that can be written to a file or sent over the network, while “deserialization” reverses that process, turning the file back into an object that can be used in code again.
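
As a minimal sketch of that round trip, the snippet below serializes a small object to bytes and restores it. The `settings` dictionary is a made-up stand-in for a real pipeline or model, and the fallback to the standard library's `pickle` (which exposes the same `dumps`/`loads` calls as dill) is only there so the example runs where dill is not installed.

```python
try:
    import dill as serializer  # the library used in this post
except ImportError:
    import pickle as serializer  # stdlib fallback with the same dumps/loads calls

# Hypothetical stand-in for a fitted pipeline or model.
settings = {"name": "example", "weights": [0.1, 0.2, 0.3]}

# Serialize: in-memory object -> bytes that could be written to a file.
payload = serializer.dumps(settings)

# Deserialize: bytes -> a new, equivalent object.
restored = serializer.loads(payload)
```

The restored object compares equal to the original but is a separate copy, which is what lets an artifact be produced in one process and used in another.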

We’ll leverage this to package up our model so it can be deployed alongside our API without being married to it, keeping the API and the model as separate concerns.

import dill

# Write the pipeline and the model to disk as two separate artifacts.
with open("api/artifact/pipe.dill", 'wb') as f:
    dill.dump(pipe, f)

with open("api/artifact/model.dill", 'wb') as f:
    dill.dump(model, f)

API

Now we’re ready to write a small, generic API that can be handed a pipeline and a model and lets us interact with them over HTTP using REST.

import dill
import pandas as pd
from flask import Flask, request


app = Flask(__name__)

# Deserialize the artifacts once, when the API starts.
with open("api/artifact/pipe.dill", 'rb') as f:
    pipe = dill.load(f)

with open("api/artifact/model.dill", 'rb') as f:
    model = dill.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Parse the request body as JSON regardless of the Content-Type header.
    raw_json = request.get_json(force=True)
    # Flatten the (possibly nested) JSON into a DataFrame.
    flat_table_df = pd.json_normalize(raw_json)
    processed = pipe.transform(flat_table_df)
    # Return the first prediction as plain text.
    return str(model.predict(processed)[0])

One thing to point out is that our API only deserializes pipe.dill and model.dill at launch. This is a benefit, since requests are answered faster after the initial boot, but the API must be restarted whenever a newer pipe.dill or model.dill file is provided.
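
To make that trade-off concrete, here is a small sketch of what "load once at launch" implies. The `save` and `load` helpers, the temporary file path, and the versioned dictionaries are all hypothetical stand-ins for the real artifacts; `pickle` is used as a fallback only so the example runs without dill installed.

```python
import os
import tempfile

try:
    import dill as serializer  # the library used in this post
except ImportError:
    import pickle as serializer  # stdlib fallback with the same dump/load calls


def save(obj, path):
    """Serialize obj to the file at path."""
    with open(path, "wb") as f:
        serializer.dump(obj, f)


def load(path):
    """Deserialize and return the object stored at path."""
    with open(path, "rb") as f:
        return serializer.load(f)


# Hypothetical artifact location for this demonstration only.
artifact_path = os.path.join(tempfile.mkdtemp(), "model.dill")

save({"version": 1}, artifact_path)
model = load(artifact_path)           # what the API does once, at launch

save({"version": 2}, artifact_path)   # a newer artifact replaces the file
stale_version = model["version"]      # the in-memory object has not changed

model = load(artifact_path)           # what a restart effectively does
fresh_version = model["version"]
```

Until the file is loaded again, the process keeps serving the object it read at startup, which is why a restart is needed to pick up a new artifact.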

The above code creates a single endpoint, /predict, that can be interacted with over HTTP REST by making a POST request with a JSON body. An example of that using curl would look like:

!curl --request POST -H "Content-Type: application/json" --data '{"temperature_celsius": 5.004}' "127.0.0.1:5000/predict"
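
The same request can also be built from Python. The sketch below uses only the standard library's `urllib`; the helper name `build_predict_request` and the explicit `http://` scheme are assumptions added for illustration, while the host, port, path, and payload mirror the curl example.

```python
import json
import urllib.request


def build_predict_request(payload, url="http://127.0.0.1:5000/predict"):
    """Build (but do not send) a POST request carrying payload as JSON."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_predict_request({"temperature_celsius": 5.004})

# With the API running locally, the request could be sent like this:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the method, headers, and body before anything goes over the network.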
