Testing and benchmarking some of the existing NLP libraries in Apache Spark
Updated Jan 11, 2019 - Scala
A scalable and real-time data pipeline for processing, analyzing, and visualizing Twitter data.
Final Project for Harvard's Scala for Big Data Systems course
This project demonstrates the use of Spark NLP to extract relationships between named entities and their part-of-speech (POS) tags. The objective is to process a given dataset with Spark and apply several NLP techniques through a Spark ML pipeline.
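As a rough illustration of what such a pipeline could look like, the sketch below chains Spark NLP annotators for POS tagging and named entity recognition inside a Spark ML `Pipeline`. It is not taken from the project: the object name `PosNerSketch`, the sample sentence, the column names, and the choice of the pretrained `glove_100d` and `ner_dl` models are all assumptions, and the pretrained models are downloaded on first use, so network access is required. Annotator names and defaults may differ between Spark NLP releases.

```scala
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: object name, sample text, and column names are illustrative only.
object PosNerSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("pos-ner-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Assumed input: a DataFrame with a single string column named "text".
    val data = Seq("Harvard University is located in Cambridge, Massachusetts.").toDF("text")

    // Wrap raw text in Spark NLP's document annotation.
    val document = new DocumentAssembler()
      .setInputCol("text")
      .setOutputCol("document")

    // Split documents into sentences, then sentences into tokens.
    val sentence = new SentenceDetector()
      .setInputCols("document")
      .setOutputCol("sentence")
    val token = new Tokenizer()
      .setInputCols("sentence")
      .setOutputCol("token")

    // Pretrained POS tagger (default English model).
    val pos = PerceptronModel.pretrained()
      .setInputCols("sentence", "token")
      .setOutputCol("pos")

    // Pretrained word embeddings feeding a pretrained NER model (assumed model names).
    val embeddings = WordEmbeddingsModel.pretrained("glove_100d")
      .setInputCols("sentence", "token")
      .setOutputCol("embeddings")
    val ner = NerDLModel.pretrained("ner_dl")
      .setInputCols("sentence", "token", "embeddings")
      .setOutputCol("ner")

    // Merge token-level NER tags into entity chunks.
    val entities = new NerConverter()
      .setInputCols("sentence", "token", "ner")
      .setOutputCol("entities")

    val pipeline = new Pipeline()
      .setStages(Array(document, sentence, token, pos, embeddings, ner, entities))

    // Fit and transform; the pretrained stages make fit() a cheap pass-through.
    val result = pipeline.fit(data).transform(data)

    // Show POS tags next to the recognized entity chunks.
    result
      .selectExpr("pos.result as pos_tags", "entities.result as entities")
      .show(truncate = false)

    spark.stop()
  }
}
```

Relating entities to their POS tags would then be a matter of joining the `pos` and `entities` annotations on their character offsets (the `begin` and `end` fields each annotation carries), which is left out here to keep the sketch short.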