Twitter is making it possible for developers and researchers to study the public conversation…


Twitter is making it possible for developers and researchers to study the public conversation around COVID-19 in real time. The dataset includes a CSV file that contains tweets extracted from the Twitter website in March 2020. The dataset is large, so you are initially required to manipulate it using shell scripting. Once you have reduced the data to a reasonable size, you are then asked to use the R programming language to further analyse and visualise the results.

Task A: Investigating Twitter Data using shell commands

Download the file covid-data.zip from the link provided above. Use the Unix shell to manipulate the file and answer the following questions.

1. Decompress the file. How big is it?

2. What delimiter is used to separate the columns in the file? Write the code to show how many columns there are.

3. How many tweets are there in total in the file?

4. Assuming that the data is sorted, what is the date range of the tweets (the dates of the first and last tweets)?

5. When was the first mention of the term “COVID-19” in your dataset? (Note that we are looking for COVID-19 in capital letters here.) What are the user_id, text and post date of this tweet?

6. How many times did the hashtag #coronavirus or #COVID-19 appear in the file in the given form? (If either hashtag appears more than once in a line, you need to count all of its occurrences to answer this question properly.)

7. As per the dataset, how many unique users (user_ids) have tweeted? List the top 10 most frequent Twitter users (user_ids) whose tweets are in English (lang = ‘en’).

8. How many times does the word “Advertisers” appear in the source column? What is the full name of the source which contains the text “Advertisers”? Print the text and post date of the first and last English tweet posted from this source.

9. Filter all the tweets with lang = ‘en’ which contain the term ‘Corona’ or ‘Covid’. Export the tweets to a new file named “covid19Final.csv”. Ensure that you restrict the tweets only to verified users who have retweeted at least 20 times. Ensure that the file “covid19Final.csv” contains the column names as well.
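
The kinds of commands these questions call for can be tried out first on a small, made-up file. The sketch below is only an illustration under stated assumptions: the sample rows are invented, the tab delimiter is a guess (question 2 asks you to find the real one), and the column order user_id, lang, text is hypothetical rather than taken from covid-data.zip.

```shell
#!/bin/sh
# Build a tiny, invented sample file. The tab delimiter and the
# column order (user_id, lang, text) are assumptions for illustration only.
sample=$(mktemp)
printf 'user_id\tlang\ttext\n' > "$sample"
printf '101\ten\tStay safe #coronavirus #coronavirus\n' >> "$sample"
printf '102\tes\tNoticias sobre #COVID-19\n' >> "$sample"
printf '101\ten\tCOVID-19 update\n' >> "$sample"
printf '103\ten\tWash your hands #coronavirus\n' >> "$sample"

# Number of columns: count the fields in the header row.
n_cols=$(head -n 1 "$sample" | awk -F'\t' '{print NF}')

# Number of tweets: every line except the header.
n_tweets=$(tail -n +2 "$sample" | wc -l | tr -d ' ')

# Hashtag occurrences: grep -o prints each match on its own line,
# so repeated matches within a single line are all counted.
n_tags=$(grep -o -e '#coronavirus' -e '#COVID-19' "$sample" | wc -l | tr -d ' ')

# Unique users: take the first column, de-duplicate, count.
n_users=$(tail -n +2 "$sample" | cut -f1 | sort -u | wc -l | tr -d ' ')

# Most frequent English-language users (at most 10), with tweet counts.
top_en=$(awk -F'\t' 'NR > 1 && $2 == "en" {print $1}' "$sample" \
  | sort | uniq -c | sort -rn | head -n 10)

echo "columns: $n_cols, tweets: $n_tweets, hashtags: $n_tags, users: $n_users"
echo "$top_en"

rm -f "$sample"
```

On the real file, the tweet text may contain the delimiter or line breaks inside quoted fields, in which case splitting on lines and fields this simply would miscount, so it is worth inspecting a few raw rows before trusting the numbers.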
