Posted: March 6th, 2022

Assessment 2: Visualization and data processing – Total marks 40

Assessed by:
Grade: /40
Outline
The following exercises are designed to assess your understanding of the concepts, implementation, and interpretation of topics in Visualization and Data Processing. Some questions may require you to search for and use R functions that we have not used so far. For all of the following questions, submit your code and output.
The questions in this assessment may have multiple correct solutions. Almost no statistical background is presumed for this assessment. All methods required for the solutions are available on the content pages of Weeks 2-5 of this subject. Some of them have been covered in detail during Collaborate sessions.
Submissions
This assessment consists of 11 questions with several sub-questions. Insert code, plots and explanations/justifications in the provided text boxes where indicated. Do not remove the headings in the text boxes. Answers outside the box won’t be marked. Note that you should not need more space than is provided (the text boxes).
Change the file name to your first and last name when submitting to Learn JCU.
Submit as a Word file or a PDF file.

Visualization:
Import the data oneworld.csv (available at https://drive.google.com/file/d/1dJnK9froCCxCn1PFEbv6svLdKnhRiFCL/view?usp=sharing) into R. The objective of this section is to explore the relationship between GDP categories, infant mortality and regions.
Q1. Insert your R code to:
Create a new ordinal variable called GDPcat with three categories, “Low” “Medium” and “High”, derived from the variable GDP with:
• The proportion of countries in each GDPcat category is approximately “Low” 40%, “Medium” 40% and “High” 20%.
• The “Low” category has countries with the lowest GDP values and the “High” category has countries with the highest GDP values.
• Remove any missing observations.
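One possible approach to Q1 is sketched below on a small invented data frame rather than the real oneworld.csv: drop the rows with a missing GDP, then cut GDP at its 40th and 80th percentiles. The `world` data frame, its `Country` column and all values are assumptions for illustration; only the names GDP and GDPcat come from the task.

```r
# Toy stand-in for oneworld.csv (all values are invented for illustration).
world <- data.frame(
  Country = paste0("C", 1:11),
  GDP = c(500, 1200, NA, 3000, 4500, 8000, 150, 22000, 900, 2500, 6000)
)

# Remove observations with a missing GDP.
world <- world[!is.na(world$GDP), ]

# Breaks at the 40th and 80th percentiles give a roughly 40/40/20 split.
breaks <- quantile(world$GDP, probs = c(0, 0.4, 0.8, 1))

# ordered_result = TRUE makes GDPcat ordinal: Low < Medium < High.
world$GDPcat <- cut(world$GDP, breaks = breaks,
                    labels = c("Low", "Medium", "High"),
                    include.lowest = TRUE, ordered_result = TRUE)

# Check the proportion of countries in each category.
print(prop.table(table(world$GDPcat)))
```

If "missing observations" is read as rows with a missing value in any column, `na.omit(world)` would be the stricter alternative to the `is.na()` filter used here.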
Q2. Insert your R code, Plot, and interpretation of the plot:
Using the ggplot2 library, visualise the relationship between GDPcat and Infant.mortality, stratified by Regions, on a single plot. Comment on your plot.
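A minimal sketch of one plot that would fit Q2: box plots of Infant.mortality by GDPcat, with one panel per region. The data below are invented, and the column name `Region` is an assumption (the real column in oneworld.csv may be named differently). Other designs, such as grouped box plots on a single panel, would also meet the brief.

```r
library(ggplot2)

# Toy data (invented values); the real data come from oneworld.csv.
world <- data.frame(
  GDPcat = factor(rep(c("Low", "Medium", "High"), times = 6),
                  levels = c("Low", "Medium", "High"), ordered = TRUE),
  Infant.mortality = c(80, 40, 8, 95, 35, 6, 70, 30, 10,
                       60, 25, 5, 75, 28, 7, 55, 20, 4),
  Region = rep(c("Region A", "Region B"), each = 9)  # hypothetical column name
)

# One box per GDP category, one facet per region, all on a single figure.
p <- ggplot(world, aes(x = GDPcat, y = Infant.mortality, fill = GDPcat)) +
  geom_boxplot() +
  facet_wrap(~ Region) +
  labs(x = "GDP category", y = "Infant mortality", fill = "GDP category")

print(p)
```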
Data Processing: Section Marks 15
Q3. Insert your R code to: Marks (4)
Write an R function to identify the proportion of missing observations in a variable or column of tabular data.
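A minimal sketch of such a function, assuming the input is a vector or a single data frame column. The name `prop_missing` is hypothetical. It relies on the fact that the mean of a logical vector is the proportion of TRUE values.

```r
# Proportion of missing (NA) values in a vector or data frame column.
prop_missing <- function(x) {
  mean(is.na(x))
}

# Two of the four values are NA, so the proportion is 0.5.
print(prop_missing(c(1, NA, 3, NA)))
```

For a zero-length input this returns NaN, so a stricter version might check `length(x) > 0` first.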
Q4. Insert the code to: Marks (2)
Apply the function from Q3 to all variables of the dataset airquality, which is available in R. Print a list of the variable names with the proportion of missing observations in each variable.
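One way to apply a per-column function across a data frame is `sapply()`, which returns a named vector. The helper `prop_missing` is a hypothetical name, defined here so the sketch runs on its own.

```r
# Hypothetical helper: proportion of NA values in a column.
prop_missing <- function(x) mean(is.na(x))

# Apply it to every column of the built-in airquality data frame.
missing_by_var <- sapply(airquality, prop_missing)

# Named numeric vector: variable name -> proportion missing.
print(round(missing_by_var, 3))
```

In airquality, only Ozone and Solar.R contain missing values, so the other variables report 0.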
Q5. Insert your justification: Marks (2)
Use the airquality dataset available in R. Specify a variable from the airquality dataset for univariate missing value imputation. Justify your variable choice based on the count or proportion of missing observations, noting that univariate imputation reduces the natural variation of a variable.
Q6. Insert the code and justification to: Marks (3)
Using base R or dplyr functions (no additional libraries) replace all missing observations in the chosen variable from Q5 with an imputation value. Justify the choice of replacement value. Hint: Read the appropriate section on your Weekly content page to perform this task.
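A base R sketch of univariate imputation, assuming Solar.R is the chosen variable and the median is the replacement value. Both choices are assumptions for illustration: the median is often preferred when a variable is skewed, while the mean is a common alternative for roughly symmetric data.

```r
# Work on a copy so the built-in dataset is left untouched.
aq <- airquality

# Count the missing values before imputation.
n_missing <- sum(is.na(aq$Solar.R))

# Replacement value: the median of the observed values.
fill_value <- median(aq$Solar.R, na.rm = TRUE)

# Replace every NA in the chosen variable with the replacement value.
aq$Solar.R[is.na(aq$Solar.R)] <- fill_value

cat("Imputed", n_missing, "values with", fill_value, "\n")
```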
Q7. Insert the code, output and explanation: Marks (4)
Compare the mean and standard deviation of the chosen variable from Q5 before and after imputation. Provide an explanation of the comparison.
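The comparison could be laid out as a small table, sketched here under the assumption that Solar.R was imputed with its median. The object names are hypothetical.

```r
# Observed values (with NAs) and a median-imputed copy.
before <- airquality$Solar.R
after <- before
after[is.na(after)] <- median(before, na.rm = TRUE)

# Mean and standard deviation before and after imputation.
comparison <- data.frame(
  stage = c("before", "after"),
  mean = c(mean(before, na.rm = TRUE), mean(after)),
  sd = c(sd(before, na.rm = TRUE), sd(after))
)
print(comparison)
```

Because every imputed value sits at the same point, the standard deviation after imputation is smaller than before, and the mean moves slightly towards the median.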

Text Analytics: Section Marks 15
Mysterydocs.RData is a collection of unstructured text documents (available at https://drive.google.com/file/d/1FU2bTUMtqrFizpEQwoz1MQ5Yw2AHRgwe/view?usp=sharing).
The response to the questions below must include comments, where indicated.
Q8. Insert the code and output to: Mark (1)
Import the Mysterydocs.RData file into R and identify the number of documents in the docs dataset.
Q9. Insert the code and output to: Marks (4)
Using methods of Week 5 Topic 2, clean the collection of texts and convert it into tabular data. Use at least 5 cleaning steps, including stemming. Display the last six rows and first five columns (only) of the cleaned tabular data that you created.
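A sketch of a typical cleaning pipeline, assuming the tm package is the tool in use (the Week 5 material is not reproduced here, so this is a guess at the intended method). The six documents are invented stand-ins for the docs object, and stemming additionally needs the SnowballC package to be installed.

```r
library(tm)  # stemDocument() also requires the SnowballC package

# Six invented documents standing in for the real docs object.
texts <- c(
  "The cats are running quickly through 3 gardens!",
  "Dogs were barking loudly at the running cats.",
  "Gardens need watering, and gardeners water them daily.",
  "Stock markets fell 2% as investors sold shares.",
  "Investors are buying shares in growing markets.",
  "Markets, gardens and cats: an unusual combination."
)

corpus <- VCorpus(VectorSource(texts))

# Six cleaning steps, including stemming.
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stemDocument)
corpus <- tm_map(corpus, stripWhitespace)

# Convert to tabular form: one row per document, one column per term.
dtm <- DocumentTermMatrix(corpus)
dtm_mat <- as.matrix(dtm)

# Last six rows and first five columns only.
print(tail(dtm_mat, 6)[, 1:5])
```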
Q10. Insert your R code and plot: Marks (3)
Create a subset of the cleaned tabular data from Q9, retaining only those words that have occurred at least 200 times within the entire corpus. Use a visualization tool to show the frequency distribution of the 50 most frequent words in the subset data. Hint: Select an appropriate visualization tool from what you learned in Week 3.
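The filtering and plotting steps can be sketched in base R on a simulated document-term matrix. The matrix, its term names and the choice of a bar plot are assumptions for illustration; a word cloud or a ggplot2 bar chart would be equally reasonable.

```r
set.seed(1)

# Simulated document-term matrix: 20 documents, 80 terms (invented counts).
n_docs <- 20
n_terms <- 80
lambdas <- seq(2, 20, length.out = n_terms)
dtm_mat <- matrix(rpois(n_docs * n_terms, lambda = rep(lambdas, each = n_docs)),
                  nrow = n_docs,
                  dimnames = list(NULL, paste0("term", seq_len(n_terms))))

# Total frequency of each term across the whole corpus.
term_freq <- colSums(dtm_mat)

# Keep only the terms that occur at least 200 times.
dtm_sub <- dtm_mat[, term_freq >= 200, drop = FALSE]

# The 50 most frequent terms in the subset (fewer if the subset is smaller).
top50 <- head(sort(colSums(dtm_sub), decreasing = TRUE), 50)

barplot(top50, las = 2, cex.names = 0.6,
        ylab = "Frequency", main = "Most frequent terms (count >= 200)")
```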
Q11. Insert your R code, plot, and interpretation of the plot: Marks (7)
Visualise a similarity matrix between documents derived from the cleaned data in Q9. Comment on the visualisation, noting any obvious structure in the similarity matrix as depicted in the plot. To visualise the similarity matrix, you may use R functions such as levelplot(), image(), or any other suitable plotting function. You will need to research how to use these functions.
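As an illustration, the sketch below computes cosine similarity between the rows of a simulated document-term matrix and displays it with `image()`. Cosine similarity is an assumption here, since the task does not name a measure, and the simulated counts are built with two groups of documents so that some structure is visible.

```r
set.seed(42)

# Simulated counts: documents 1-5 favour terms 1-15, documents 6-10 favour terms 16-30.
n_docs <- 10
n_terms <- 30
lambda <- matrix(1, n_docs, n_terms)
lambda[1:5, 1:15] <- 8
lambda[6:10, 16:30] <- 8
dtm_mat <- matrix(rpois(n_docs * n_terms, lambda), nrow = n_docs)

# Cosine similarity: dot product of two rows divided by the product of their lengths.
norms <- sqrt(rowSums(dtm_mat^2))
sim <- (dtm_mat %*% t(dtm_mat)) / (norms %o% norms)

# Heat map of the document-by-document similarity matrix.
image(x = seq_len(n_docs), y = seq_len(n_docs), z = sim,
      xlab = "Document", ylab = "Document",
      main = "Cosine similarity between documents")
```

With data built this way, two blocks of higher similarity along the diagonal would suggest two groups of related documents; in real data, the ordering of the documents determines whether such blocks are visible.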
