Word Embedding Visualization

This project visualizes how neural networks learn word embeddings.

It uses a neural network implemented in PyTorch to learn word embeddings and predict part-of-speech (POS) tags. The network consists of an embedding layer followed by a linear layer.
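As a rough illustration, an embedding layer followed by a linear layer could be sketched as below. The class name `PosTagger` and its parameter names are hypothetical and are not taken from `embedding_visualization.py`. The sizes (ten words, two dimensions, four tags) follow the description in this README.

```python
import torch
from torch import nn


class PosTagger(nn.Module):
    """Hypothetical sketch: an embedding layer followed by a linear layer."""

    def __init__(self, vocab_size: int = 10, embedding_dim: int = 2, num_tags: int = 4):
        super().__init__()
        # One trainable 2-D coordinate pair per word in the vocabulary.
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        # Maps a 2-D embedding to one score per POS tag.
        self.linear = nn.Linear(embedding_dim, num_tags)

    def forward(self, word_ids: torch.Tensor) -> torch.Tensor:
        return self.linear(self.embedding(word_ids))


model = PosTagger()
# Three arbitrary word indices; the result holds one row of tag scores per word.
scores = model(torch.tensor([0, 2, 6]))
```

The rows of `model.embedding.weight` are the x/y coordinates that the visualizations plot.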

The training examples contain sentences where each word is associated with the correct POS tag. The dictionary used for training consists of only ten words: the, a, woman, dog, apple, book, reads, eats, green, and good. The POS tags are determiner, noun, verb and adjective.
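One way such training examples might be encoded as integer indices is sketched below. The word list and tag list come from this README. The two example sentences and the helper name `encode` are made up for illustration and are not necessarily the data used by the script.

```python
WORDS = ["the", "a", "woman", "dog", "apple", "book", "reads", "eats", "green", "good"]
TAGS = ["determiner", "noun", "verb", "adjective"]

# Lookup tables from word / tag to integer index.
word_to_ix = {word: i for i, word in enumerate(WORDS)}
tag_to_ix = {tag: i for i, tag in enumerate(TAGS)}

# Hypothetical training sentences: each word is paired with its POS tag.
training_data = [
    ("the woman reads a good book".split(),
     ["determiner", "noun", "verb", "determiner", "adjective", "noun"]),
    ("a dog eats the green apple".split(),
     ["determiner", "noun", "verb", "determiner", "adjective", "noun"]),
]


def encode(words, tags):
    """Convert a tagged sentence into lists of word indices and tag indices."""
    return [word_to_ix[w] for w in words], [tag_to_ix[t] for t in tags]


encoded = [encode(words, tags) for words, tags in training_data]
```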

The goal of the network is to learn a two-dimensional word embedding for each word. In practical applications the dimensionality would of course be higher. Only two dimensions are used to allow easy visualization of the training process on an x/y graph.

Initially, each word is assigned random coordinates. In each training iteration, the neural network uses backpropagation to adjust the coordinates of each word slightly. The aim is to move the words to positions in the space that are better suited to predicting the part of speech accurately.
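A minimal sketch of such a training loop follows. The loss function (cross-entropy), the optimizer (plain SGD), the learning rate, and the number of steps are assumptions chosen for illustration; the README does not state which ones the script uses. For brevity, each of the ten words appears once with its tag instead of in full sentences.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Word indices 0..9 in the order: the, a, woman, dog, apple, book, reads, eats, green, good.
word_ids = torch.arange(10)
# Tag indices: 0 = determiner, 1 = noun, 2 = verb, 3 = adjective.
tag_ids = torch.tensor([0, 0, 1, 1, 1, 1, 2, 2, 3, 3])

embedding = nn.Embedding(10, 2)  # coordinates are randomly initialized
linear = nn.Linear(2, 4)
model = nn.Sequential(embedding, linear)

loss_fn = nn.CrossEntropyLoss()                           # assumed loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)   # assumed optimizer

initial_coords = embedding.weight.detach().clone()
losses = []
for step in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(word_ids), tag_ids)
    loss.backward()    # backpropagation computes the gradients
    optimizer.step()   # each word's coordinates move slightly
    losses.append(loss.item())

final_coords = embedding.weight.detach()
```

Recording `embedding.weight` after every step, rather than only at the start and end, would give the sequence of positions needed for an animation.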

Visualizing the learning of word embeddings

We start with ten words that are randomly placed in a two-dimensional vector space. As the model learns, it optimizes the words' coordinates (their embeddings) step by step, so that words with the same POS end up in the same area.

Visualizing the learning of prediction boundaries

This visualization additionally shows the prediction boundaries, i.e. which POS tag the linear layer predicts for each region of the embedding space.
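One possible way to obtain such boundaries is to evaluate the linear layer on a grid of 2-D points and record the highest-scoring tag at each point. The sketch below assumes this approach; it is not necessarily how `embedding_visualization.py` draws them. The grid range, the resolution, and the untrained stand-in layer are illustrative choices.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in for the trained linear layer (2-D embedding -> 4 tag scores).
linear = nn.Linear(2, 4)

# Regular grid over the part of the plane to be drawn (assumed range).
xs = torch.linspace(-3.0, 3.0, 50)
ys = torch.linspace(-3.0, 3.0, 50)
grid_x, grid_y = torch.meshgrid(xs, ys, indexing="xy")
points = torch.stack([grid_x.flatten(), grid_y.flatten()], dim=1)

with torch.no_grad():
    # Index of the highest-scoring tag at every grid point.
    regions = linear(points).argmax(dim=1).reshape(grid_x.shape)
```

Coloring the plane by `regions` (for example with a filled contour plot) shows where the predicted tag changes. Because the classifier is a single linear layer, those boundaries are straight line segments.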

Another run, with a very different outcome

The last example shows that, depending on the random initialization, the network learns different representations and classification boundaries.
