How to build a deep neural network for custom NER with Keras

Nikita sharma
Dec 29, 2019

Introduction

In this post, we will learn how to build a simple neural network with Keras that extracts information (NER) from unstructured text data.

Named Entity Recognition (NER)

NER is also known as entity identification or entity extraction. It is the process of identifying predefined entities in a text, such as person names, organisations, locations, etc. An NER system is typically a statistical model that is trained on a labelled data set and then used to extract these entities from new text.

Sometimes we want to extract information that is specific to our domain or industry. For example, in the medical domain we may want to extract diseases, symptoms, medications, etc. In that case we need to create our own custom NER.
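To make the idea of a custom entity concrete, here is a minimal sketch of what labelled data for a medical NER could look like. The sentence, the tag names (DISEASE, MEDICATION) and the BIO tagging scheme (B- starts an entity, I- continues it, O is outside any entity) are assumptions made for illustration; they are not taken from a real dataset.

```python
# Hypothetical labelled sentence for a medical NER (tokens and tags are made up).
tokens = ["The", "patient", "has", "type", "2", "diabetes",
          "and", "takes", "metformin", "."]
tags = ["O", "O", "O", "B-DISEASE", "I-DISEASE", "I-DISEASE",
        "O", "O", "B-MEDICATION", "O"]


def extract_entities(tokens, tags):
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    entities = []
    current_words, current_type = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts here; close any entity that is still open.
            if current_words:
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the entity that is currently open.
            current_words.append(token)
        else:
            # "O" (or an inconsistent I- tag) ends the open entity, if any.
            if current_words:
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [], None
    if current_words:
        entities.append((" ".join(current_words), current_type))
    return entities


print(extract_entities(tokens, tags))
# [('type 2 diabetes', 'DISEASE'), ('metformin', 'MEDICATION')]
```

A custom NER model is trained to predict the tag list from the token list; the helper above only shows how predicted tags are turned back into entities.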

Model Architecture

Here we will use BI-LSTM + CRF layers. The gates of the LSTM filter out unwanted information and keep only the important features of each word in its context, while the CRF layer models the dependencies between neighbouring tags in the sequence.

BI-LSTM Layer

The BI-LSTM layer produces a vector representation for each word. It takes each word of a sentence as input and reads the sentence in both directions: the forward pass has access to past (preceding) context and the backward pass has access to future (following) context. The two representations are combined and the result is passed to the CRF layer.
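As a rough sketch of the BI-LSTM part, the Keras layers below map a padded sequence of word indices to per-word tag probabilities. The vocabulary size, sequence length, layer sizes and number of tags are placeholder values. The CRF layer is left out here because core Keras does not ship one (it comes from add-on packages); a per-word softmax stands in for it, so this is not the full BI-LSTM + CRF model.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Placeholder sizes, chosen only for illustration.
VOCAB_SIZE = 10000  # number of distinct word indices
MAX_LEN = 50        # every sentence is padded/truncated to this length
N_TAGS = 5          # number of entity tags

# One integer word index per position in the sentence.
inputs = keras.Input(shape=(MAX_LEN,), dtype="int32")
# Turn each word index into a dense 64-dimensional vector.
x = layers.Embedding(input_dim=VOCAB_SIZE, output_dim=64)(inputs)
# Run one LSTM forward and one backward; their per-word outputs are
# concatenated, giving a 128-dimensional vector for every word.
x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)
# Stand-in for the CRF: an independent softmax over tags for each word.
outputs = layers.TimeDistributed(layers.Dense(N_TAGS, activation="softmax"))(x)

model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
print(model.output_shape)  # (None, 50, 5)
```

`return_sequences=True` matters here: NER needs one output per word, not one output per sentence.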

CRF Layer

The CRF layer is an optimisation on top of the BI-LSTM layer. It predicts the current tag efficiently by taking the previously assigned tags into account. Here is a great post on why a CRF layer is useful on top of BI-LSTM.
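To illustrate how earlier tags can influence the current one, here is a small pure-Python Viterbi decoder, the standard way of finding the best tag sequence in a linear-chain CRF. All scores are made up for this example; in a trained model the emission scores would come from the BI-LSTM and the transition scores would be learned by the CRF layer.

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag sequence.

    emissions[t][tag]      : score of `tag` at position t
    transitions[prev][tag] : score of moving from tag `prev` to `tag`
    """
    # best[t][tag] = best score of any tag path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t - 1][tag] = best previous tag for `tag` at position t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for tag in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][tag])
            scores[tag] = best[-1][prev] + transitions[prev][tag] + emissions[t][tag]
            pointers[tag] = prev
        best.append(scores)
        back.append(pointers)
    # Start from the best final tag and follow the back-pointers.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


tags = ["O", "B-PER", "I-PER"]
# Made-up per-word scores for a three-word sentence.
emissions = [
    {"O": 1.0, "B-PER": 0.8, "I-PER": 0.1},
    {"O": 0.2, "B-PER": 0.3, "I-PER": 1.5},
    {"O": 2.0, "B-PER": 0.1, "I-PER": 0.1},
]
# Made-up transition scores; O -> I-PER is heavily penalised.
transitions = {
    "O":     {"O": 0.5, "B-PER": 0.0, "I-PER": -10.0},
    "B-PER": {"O": 0.0, "B-PER": -1.0, "I-PER": 1.0},
    "I-PER": {"O": 0.0, "B-PER": -1.0, "I-PER": 0.5},
}

# Picking the best tag for each word independently gives an invalid sequence.
greedy = [max(tags, key=lambda tag: e[tag]) for e in emissions]
print(greedy)                                  # ['O', 'I-PER', 'O']
print(viterbi(emissions, transitions, tags))   # ['B-PER', 'I-PER', 'O']
```

The per-word choice starts an entity with I-PER, which makes no sense; once transitions are scored as well, the first word is re-tagged B-PER even though O had the higher emission score.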

Data Preprocessing

Data Format

For this example I have used this Kaggle dataset. For our model, we need a data frame that contains a ‘Sentence_id’/‘Sentence’ column, a ‘word’ column and a ‘tag’ column.
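The expected layout can be sketched with a tiny hand-made data frame. The rows and tag names below are invented for illustration (they are not read from the Kaggle dataset); only the column names come from the description above.

```python
import pandas as pd

# Toy rows, invented for illustration: one row per word.
df = pd.DataFrame({
    "Sentence_id": [1, 1, 1, 2, 2],
    "word": ["London", "is", "big", "Alice", "works"],
    "tag": ["B-geo", "O", "O", "B-per", "O"],
})

# Collect the (word, tag) pairs of each sentence, keeping the original order.
sentences = [
    list(zip(group["word"], group["tag"]))
    for _, group in df.groupby("Sentence_id", sort=False)
]

print(sentences[0])  # [('London', 'B-geo'), ('is', 'O'), ('big', 'O')]
```

Grouping by ‘Sentence_id’ turns the one-row-per-word table into one list per sentence, which is the shape a sequence model consumes.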
