AutoCorrect / Spell Check using Deep Learning in Python!

Introduction

Data Preprocessing

!curl -O http://www.manythings.org/anki/fra-eng.zip
!unzip fra-eng.zip
# Draws 0 with probability 0.1 and 1 with probability 0.9.
np.random.choice(np.arange(0, 2), p=[0.1, 0.9])
input             target
uom misled you    tom misled you\n
tou uisled you    tom misled you\n
tomvmisledvyou    tom misled you\n
tgm misled ygu    tom misled you\n
tom misled you    tom misled you\n
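The table above pairs noisy inputs with one clean target. Here is a minimal sketch of how such pairs could be generated. It assumes that a draw of 1 means "keep the character" and 0 means "replace it with a random lowercase letter"; the original noise function is not shown, so `add_noise` and `keep_prob` are hypothetical names. It also uses the standard library's `random.choices` in place of `np.random.choice` so that it runs without NumPy.

```python
import random
import string


def add_noise(sentence, keep_prob=0.9, rng=random):
    """Hypothetical helper: corrupt each character with probability 1 - keep_prob."""
    noisy = []
    for ch in sentence:
        # Draw 1 (keep) with probability keep_prob and 0 (corrupt) otherwise,
        # analogous to np.random.choice(np.arange(0, 2), p=[0.1, 0.9]).
        keep = rng.choices([0, 1], weights=[1 - keep_prob, keep_prob])[0]
        # Assumption: a corrupted character becomes a random lowercase letter.
        noisy.append(ch if keep else rng.choice(string.ascii_lowercase))
    return "".join(noisy)


target = "tom misled you"
rng = random.Random(0)  # fixed seed so repeated runs give the same samples

# Several noisy inputs for the same clean target; the trailing "\n" on the
# target matches the table and is assumed to mark the end of the sequence.
pairs = [(add_noise(target, rng=rng), target + "\n") for _ in range(5)]
for noisy, clean in pairs:
    print(repr(noisy), "->", repr(clean))
```

Because every character, including the space, is corrupted independently, some samples may come out unchanged and others may lose their word boundaries, as in the rows of the table.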
Input Token Index:{' ': 0,
'a': 1,
'b': 2,
'c': 3,
'd': 4,
'e': 5,
'f': 6,
'g': 7,
'h': 8,
'i': 9,
'j': 10,
'k': 11,
'l': 12,
'm': 13,
'n': 14,
'o': 15,
'p': 16,
'q': 17,
'r': 18,
's': 19,
't': 20,
'u': 21,
'v': 22,
'w': 23,
'x': 24,
'y': 25,
'z': 26}
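An index like the one above can be built by sorting the set of characters seen in the data and numbering them. The following is a minimal sketch under that assumption; `build_token_index` is a hypothetical helper, and the single-string corpus is only a stand-in for the real input texts.

```python
import string


def build_token_index(texts):
    """Hypothetical helper: map each distinct character to an integer id."""
    # Sorting makes the mapping deterministic; the space character sorts
    # before the letters, so it receives index 0.
    chars = sorted({ch for text in texts for ch in text})
    return {ch: i for i, ch in enumerate(chars)}


# Stand-in corpus containing a space and every lowercase letter.
input_token_index = build_token_index([" " + string.ascii_lowercase])
print("Input Token Index:", input_token_index)
```

With this stand-in corpus the result has 27 entries, from the space at 0 to 'z' at 26.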

Model

Model Architecture

Configurations (you can adjust these depending on your available compute):
batch_size = 64  # Batch size for training.
epochs = 100  # Number of epochs to train for.
latent_dim = 128  # Latent dimensionality of the encoding space.
output_dim = 64

Inference

# Reverse-lookup token index to decode sequences back to
# something readable.
reverse_input_char_index = dict((i, char) for char, i in input_token_index.items())
reverse_target_char_index = dict((i, char) for char, i in target_token_index.items())
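To see what the reverse lookup does, here is a small round-trip sketch: a sentence is encoded to integer ids and then decoded back to text. The `encode` and `decode` helpers are hypothetical and only illustrate the lookup; in the actual inference loop the ids would come from the model's predictions rather than from `encode`.

```python
# Rebuild the input index shown earlier: space is 0, 'a' to 'z' are 1 to 26.
input_token_index = {ch: i for i, ch in enumerate(" abcdefghijklmnopqrstuvwxyz")}
reverse_input_char_index = dict((i, char) for char, i in input_token_index.items())


def encode(text, token_index):
    """Hypothetical helper: turn a string into a list of integer ids."""
    return [token_index[ch] for ch in text]


def decode(indices, reverse_index):
    """Hypothetical helper: turn a list of integer ids back into a string."""
    return "".join(reverse_index[i] for i in indices)


encoded = encode("tom misled you", input_token_index)
decoded = decode(encoded, reverse_input_char_index)
print(encoded)
print(decoded)
```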
Input : tgm misled ygu
Output : tom missed you
Input : yeu mvst lefve
Output : you must leave

Further Improvements

References

Junior Researcher @ Buddi.ai