Making your data science project more reliable, testable, and deployable

Photo by Jon Tyson on Unsplash

Introduction

In this post, we will learn some best practices for improving the quality and reliability of production data science code.

Note: Most of the things mentioned here are not new to the software engineering world, but they often get ignored or missed in the experimental world of data science.

In this post, I will briefly cover the topics and the things we can do to make our project more reliable. I will follow up with a few posts that describe each of these steps in more detail using an example project. …


Simple instructions for deploying your Streamlit app on the Heroku cloud platform

Photo by Stephen Dawson on Unsplash

Introduction

If you want to deploy an interactive dashboard or your portfolio as a web page on a cloud platform, Heroku is a great place to host it. In our previous post, we talked about how to build interactive dashboards in Python using Streamlit. Here we will deploy our Streamlit app to a cloud platform.

Heroku

Heroku is a cloud platform that runs our apps in virtual containers (dynos), which execute in a runtime environment. These containers support many different languages, such as Ruby, Scala, Python, and Java. It also provides custom buildpacks, with which we can deploy apps in any other language.

Setup Heroku


Keep an eye on your AWS costs and don’t run out of credits or $$$

Photo by NeONBRAND on Unsplash

If you are a startup, you probably want to keep a very close eye on your costs. And if you are a student relying on your available credits to get you through your university projects, cost is going to be a big concern for you.

This is going to be a very quick post on getting some visibility over our AWS expenses and resources.

AWS cost monitor

Visit the AWS cost monitor. Launch Cost Explorer, click on Reports, and create a new report.


Photo by Romson Preechawit on Unsplash

Cause of error

This error is caused by a mismatch between the versions of tensorflow-gpu and CUDA. Every tensorflow-gpu release depends on a very specific CUDA version.

Check our versions

Check the tensorflow-gpu version:

pip list | grep tensorflow-gpu

Our tensorflow-gpu version is 1.8.0.

Check the CUDA version:

ls -l /usr/local/cuda

Our cuda version is cuda-8.0.
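The same two checks can also be scripted. The snippet below is a minimal sketch, not part of the original steps: it assumes that /usr/local/cuda is a symlink to a versioned directory such as cuda-8.0 (as the ls output above suggests) and that Python 3.8 or later is available for importlib.metadata. The function names are hypothetical helpers chosen for illustration.

```python
import os
import re
from importlib import metadata
from typing import Optional


def cuda_version_from_path(path: str) -> Optional[str]:
    """Return the version embedded in a name like 'cuda-8.0', or None."""
    match = re.search(r"cuda-(\d+(?:\.\d+)*)", path)
    return match.group(1) if match else None


def installed_cuda_version(link: str = "/usr/local/cuda") -> Optional[str]:
    """Resolve the symlink (assumed layout) and parse the version from its target."""
    return cuda_version_from_path(os.path.realpath(link))


def installed_package_version(name: str = "tensorflow-gpu") -> Optional[str]:
    """Return the installed version of a pip package, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


print(cuda_version_from_path("/usr/local/cuda-8.0"))  # 8.0
```

Calling installed_package_version() and installed_cuda_version() on the machine in question gives the two values to compare against the compatibility table.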

Investigate issue

Which CUDA versions are compatible with TensorFlow?

Let’s refer to the official TensorFlow page for version compatibility.


Stop being limited by local system resources and move your deep learning workloads to a cloud GPU

Photo by Zii Miller on Unsplash

Get GPU from AWS

Let’s create a GPU instance for our deep learning workloads. We need an AWS EC2 instance for this. Log in to the AWS web console, look up the EC2 service, and click Launch Instance.


Level up your data science projects with interactive dashboards

If you are working on a visualisation project and want to demo your findings, or if you are hitting the job market and want to create some portfolio projects, interactive dashboards are a great way to present the information in an accessible way.

Streamlit

We will be using Python and Streamlit for our interactive dashboards today.

Streamlit is an open-source app framework that makes life easier for data scientists by letting them explore and understand data through a beautiful dashboard.

Setting up Streamlit

Let’s first install Streamlit on our system and then run the hello command to verify that everything is working…
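On the command line this usually comes down to pip install streamlit followed by streamlit hello. As a rough sketch, the same check can be driven from Python. The helper names below are hypothetical, and the script assumes pip is available for the current interpreter.

```python
import importlib.util
import subprocess
import sys


def is_installed(module: str) -> bool:
    """True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module) is not None


def install_and_run_hello() -> None:
    """Install Streamlit if it is missing, then launch its built-in demo app."""
    if not is_installed("streamlit"):
        subprocess.check_call([sys.executable, "-m", "pip", "install", "streamlit"])
    # Blocks until the demo server is stopped (for example with Ctrl+C).
    subprocess.check_call([sys.executable, "-m", "streamlit", "hello"])
```

Calling install_and_run_hello() should start the demo app if the installation works.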


Image courtesy : Google

Introduction

In this post, we will learn how to create a simple neural network with Keras to extract information (NER) from unstructured text data.

Named Entity Recognition (NER)

NER is also known as entity identification or entity extraction. It is the process of identifying predefined entities in a text, such as person names, organisations, and locations. It typically relies on a statistical model that is trained on a labelled data set and then used to extract information from a given set of data.
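To make the idea of labelled entities concrete, here is a small sketch of how per-token labels can be turned back into entities. It assumes the common BIO tagging scheme (B- begins an entity, I- continues it, O is outside any entity), which is an assumption for illustration and not necessarily the labelling used in this post. The function name and the sample sentence are hypothetical.

```python
from typing import List, Optional, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    entities: List[Tuple[str, str]] = []
    current: List[str] = []
    current_type: Optional[str] = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close the one in progress, if any.
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == current_type:
            # Continuation of the entity in progress.
            current.append(token)
        else:
            # "O" or an inconsistent I- tag: close the entity and skip the token.
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [], None
    if current:
        entities.append((" ".join(current), current_type))
    return entities


tokens = ["Ada", "Lovelace", "worked", "in", "London"]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
print(decode_bio(tokens, tags))  # [('Ada Lovelace', 'PER'), ('London', 'LOC')]
```

A trained tagger would produce the tags list; the decoding step stays the same whatever model predicts them.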

Sometimes we want to extract information specific to our domain or industry. For example, in the medical domain, we want…


Photo by Jubal Kenneth Bernal on Unsplash

Named Entity Recognition (NER)

NER is also known as entity identification or entity extraction. It is the process of identifying predefined entities in a text, such as person names, organisations, and locations. It typically relies on a statistical model that is trained on a labelled data set and then used to extract information from a given set of data.

Sometimes we want to extract information specific to our domain or industry. For example, in the medical domain we may want to extract diseases, symptoms, or medications. In that case, we need to create our own custom NER.
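A custom NER starts with labelled examples for the new entity types. The sketch below shows one possible way to record such labels as character offsets. The labels SYMPTOM and MEDICATION, the sample sentence, and the annotate helper are all hypothetical and only illustrate the shape of the data.

```python
from typing import Dict, List, Tuple

Annotation = Tuple[int, int, str]  # (start_char, end_char, label)


def annotate(text: str, phrases: Dict[str, str]) -> List[Annotation]:
    """Label the first occurrence of each phrase with its entity type."""
    spans: List[Annotation] = []
    for phrase, label in phrases.items():
        start = text.find(phrase)
        if start != -1:  # phrases that do not occur are skipped
            spans.append((start, start + len(phrase), label))
    return sorted(spans)


text = "Patient reports fever and was prescribed paracetamol."
spans = annotate(text, {"fever": "SYMPTOM", "paracetamol": "MEDICATION"})
for start, end, label in spans:
    print(text[start:end], label)
```

In practice such annotations usually come from manual labelling rather than string matching; the matching here only keeps the example short.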

spaCy

It is an open source software…


Applying Deep Learning on text corpuses for Job Skills extraction

Link to white paper

I was recently researching various text mining and language processing techniques to extract job skills from job postings and resume data. The input data is a free-text corpus and the expected output is the desired skill set for a given job profile.

I decided to document all my research as a paper with all the technical details that might be useful for someone researching a similar problem. So here it is -

Direct link to paper : https://confusedcoders.com/wp-content/uploads/2019/09/Job-Skills-extraction-with-LSTM-and-Word-Embeddings-Nikita-Sharma.pdf

The results of the exercise were very promising, and I was able to extend the model to various Job…


Photo by Clint Adair on Unsplash

In our previous posts, we discussed getting started with knowledge graphs, where we saw how to install Neo4j in Docker, and modelling tabular data as a graph, where we saw how to push data to our graph database.

Now that we have all the data in our graph database, we can analyse it directly in Neo4j. However, we often need to access a small part of the data from apps, so we need to build an API layer on top of the graph DB and expose only the relevant information.

In this post, we will discuss…

Nikita Sharma

Data Scientist | Python programmer
