{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Machine Learning Accelerator - Natural Language Processing - Lecture 1\n", "\n", "## Bag of Words Method\n", "\n", "In this notebook, we go over the Bag of Words (BoW) method to convert text data into numerical values, that will be later used for predictions with machine learning algorithms.\n", "\n", "To convert text data to vectors of numbers, a vocabulary of known words (tokens) is extracted from the text, the occurence of words is scored, and the resulting numerical values are saved in vocabulary-long vectors.\n", "\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", "
\n", " \n", " \n", " " ], "text/plain": [ "