An NLP Approach to Analyzing Twitter, Trump, and Profanity

Who swears more? Do Twitter users who mention Donald Trump swear more than those who mention Hillary Clinton? Let’s find out by taking a natural language processing (NLP) approach to analyzing tweets.

This walkthrough provides a basic introduction to help developers of all backgrounds and abilities get started with the NLP microservices available on Algorithmia. We’ll show you how to chain them together to perform light analysis on unstructured text. Unfamiliar with NLP? Our gentle introduction to NLP will help you get started.

We know that getting started with a new platform or developer tool is an investment in time and energy. Sometimes it can be hard to find the information you need in order to start exploring on your own. That’s why we’ve centralized all our information in the Algorithmia Developer Center and API Docs, where users will find helpful hints, code snippets, and getting started guides. These guides are designed to help developers integrate algorithms into applications and projects, learn how to host their trained machine learning models, or build their own algorithms for others to use via an API endpoint.

Now, let’s tackle a project that uses a few algorithms to retrieve content and analyze it with NLP. What better place to start than Twitter, and analyzing our favorite presidential candidates?

Each algorithm’s description lists the input and output data structures it expects, along with any other requirements. For instance, Retrieve Tweets with Keyword requires your Twitter API authentication keys.

You’ll need a free Algorithmia account to complete this project. Sign up for free and receive an extra 10,000 credits. Overall, the project will process around 700 tweets, with emoticons and other special characters stripped out. This means that a tweet containing only URLs and emoticons won’t be analyzed. Once we pull our data from the Twitter API, we’ll clean it up with some regex, remove stop words, and then find our swear words.
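To make the clean-up steps concrete, here is a minimal local sketch of the regex cleaning, stop-word removal, and swear-word counting described above. It is a stand-in written in plain Python, not the Algorithmia microservices themselves; the word lists, the regular expressions, and the function names clean_tweet and count_swears are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny word lists; swap in real ones.
STOP_WORDS = {"a", "an", "the", "is", "to", "and", "of", "for", "in", "on"}
SWEAR_WORDS = {"damn", "hell"}

URL_RE = re.compile(r"https?://\S+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")  # emoticons, punctuation, digits


def clean_tweet(text):
    """Lowercase, drop URLs and non-letter characters, then remove stop words."""
    text = URL_RE.sub(" ", text.lower())
    text = NON_ALPHA_RE.sub(" ", text)
    return [word for word in text.split() if word not in STOP_WORDS]


def count_swears(tweets):
    """Return (number of tweets analyzed, number of swear-word occurrences)."""
    analyzed = 0
    swears = 0
    for tweet in tweets:
        words = clean_tweet(tweet)
        if not words:
            # Nothing left after cleaning (only URLs or emoticons): skip it.
            continue
        analyzed += 1
        swears += sum(1 for word in words if word in SWEAR_WORDS)
    return analyzed, swears
```

Calling count_swears once with the tweets that mention each candidate gives two pairs of counts that can be compared as swear words per analyzed tweet.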

Okay, let’s go over the obvious parts of the code snippet. This algorithm takes a nested dictionary called ‘input’ with three keys: ‘query’, ‘numTweets’, and ‘auth’, which is itself a dictionary. The key ‘query’ is set to a global variable called q_input, which holds the command-line argument passed when executing the script. In our case it will hold a presidential nominee’s name. The key ‘numTweets’ is set to the number of tweets you want to extract, and the dictionary ‘auth’ holds the authentication keys and tokens that you got from Twitter.
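As a rough sketch of the structure just described, the input dictionary might be assembled like this. The top-level keys ‘query’, ‘numTweets’, and ‘auth’ come from the description above; the key names inside ‘auth’, the helper name build_input, and the fallback query are assumptions, so check the algorithm’s description for the exact names and value types it expects.

```python
import sys

# The search term comes from the command line, e.g. a nominee's name.
# The fallback value is only an illustrative default.
q_input = sys.argv[1] if len(sys.argv) > 1 else "Donald Trump"


def build_input(query, num_tweets, app_key, app_secret,
                oauth_token, oauth_token_secret):
    """Assemble the nested dictionary passed to the tweet-retrieval algorithm."""
    return {
        "query": query,
        "numTweets": num_tweets,
        # Assumed key names for the Twitter credentials; verify them
        # against the algorithm's description page.
        "auth": {
            "app_key": app_key,
            "app_secret": app_secret,
            "oauth_token": oauth_token,
            "oauth_token_secret": oauth_token_secret,
        },
    }
```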

As you write the pull_tweets() function, pay attention to the line that sets the variable ‘client’ to ‘Algorithmia.client(algorithmia_api_key)’. This is where you pass in the API key you were assigned when you signed up for an Algorithmia account. If you don’t recall where to find it, look in the Credentials section of the My Profile page.

Next, notice the variable ‘algo’. This is where we pass in the path to the algorithm we’re using. Each algorithm’s documentation will give you the appropriate path in the code examples section at the bottom of the algorithm page.
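Putting ‘client’ and ‘algo’ together, a minimal sketch of pull_tweets() could look like the following. It assumes the Algorithmia Python client is installed (pip install algorithmia). The parameter names payload and algo_path, and the optional client argument that lets you supply a ready-made or stand-in client, are additions for illustration; the algorithm path itself should be copied from the algorithm page rather than guessed.

```python
def pull_tweets(payload, algorithmia_api_key, algo_path, client=None):
    """Send the input dictionary to the algorithm and return its result.

    payload   -- the nested 'input' dictionary (query, numTweets, auth)
    algo_path -- the path shown in the algorithm's code examples section
    client    -- optional pre-built client (hypothetical convenience parameter)
    """
    if client is None:
        # Third-party package, imported lazily so the function can also be
        # exercised with a stand-in client.
        import Algorithmia
        client = Algorithmia.client(algorithmia_api_key)

    # Point the client at the algorithm we want to call.
    algo = client.algo(algo_path)

    # pipe() sends the input; .result holds the algorithm's output.
    return algo.pipe(payload).result
```

Keeping the API key and the algorithm path as parameters means neither is hard-coded in the script, which makes it easier to move to a newer version of the algorithm later.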

Last modified on Wednesday, 18 April 2018 16:07
Super User
