Data Science From Scratch To Production MVP Style: Model

Modeling

This section is the least important, so we’re going to be incredibly sloppy here. We’ll perform a simple train/test split and fit a simple Linear Regression model.

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)
    model = LinearRegression().fit(X_train, y_train)

    model.score(X_test, y_test)
    # 0.498875410976377

    model.coef_
    # array([635.1964214])

    model.intercept_
    # 5810.549007860056

    pyplot.scatter(X_test, y_test, color = 'red')
    pyplot.plot(X_train, model.predict(X_train), color = 'blue')
    pyplot.title('temperature_fahrenheight vs ice_cream_sales_usd (Test set)')
    pyplot.xlabel('temperature_fahrenheight')
    pyplot.ylabel('ice_cream_sales_usd')
    pyplot.show()

While our model isn’t great, let’s pretend we’re satisfied with it and move on to preparing to wrap our model in an API and getting it into production....
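As a rough illustration of what the reported coefficient and intercept mean, the fitted line can be evaluated by hand. This is only a sketch: the two numbers are copied from the `model.coef_` and `model.intercept_` output in the excerpt, while the function name `predict_sales_usd` and the sample temperatures are hypothetical and not from the original post.

```python
# Values copied from the model.coef_ and model.intercept_ output in the excerpt.
SLOPE_USD_PER_DEGREE = 635.1964214
INTERCEPT_USD = 5810.549007860056


def predict_sales_usd(temperature_fahrenheit: float) -> float:
    """Evaluate the fitted line: sales = slope * temperature + intercept.

    Hypothetical helper for illustration; it mirrors what model.predict
    computes for a one-feature linear regression.
    """
    return SLOPE_USD_PER_DEGREE * temperature_fahrenheit + INTERCEPT_USD


if __name__ == "__main__":
    # Sample temperatures are made up for the demo.
    for temperature in (60.0, 75.0, 90.0):
        print(f"{temperature:.0f} F -> ${predict_sales_usd(temperature):,.2f}")
```

Read this way, each extra degree Fahrenheit adds about 635 USD to the predicted sales. With a test score (R²) near 0.5, roughly half the variance in sales is left unexplained, so individual predictions should be treated as loose estimates.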

April 9, 2020 · 1 min · Greg Hilston

Data Science From Scratch To Production MVP Style: Pipeline

Goal Of This “Data Science From Scratch To Production MVP Style” Series?

The goal of this series is to walk through four steps:

1. Create a pipeline to process raw data
2. Create a simple model to make predictions on future raw data
3. Wrap the pipeline and model in a simple API
4. Deploy the API to production, using AWS Lambda

Each step above will have its own blog post. This is the first of the four....

April 8, 2020 · 6 min · Greg Hilston