You used Google Sheets for data manipulation. In week 4, you learned how to use Tidyverse for data manipulation. Compare the us

Posted 24 Mar at 05:04h in Business & Finance / Accounting

You used Google Sheets for data manipulation. In week 4, you learned how to use Tidyverse for data manipulation.

Compare the use of Google Sheets and Tidyverse for data manipulation. Be specific about how you can do particular data manipulation tasks in Tidyverse and how the same tasks can be done in Google Sheets.
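One way to approach the comparison is to line up each dplyr verb with a rough Google Sheets counterpart. The sketch below is only an illustration: it assumes R's built-in iris data and, on the Sheets side, a hypothetical layout with a header in row 1 and the five iris columns in A2:E151 (Sepal.Length in column A, Species in column E). The spreadsheet formulas appear only as comments and are approximate counterparts, not exact equivalents.

```r
library(dplyr)

# Select rows: keep only the virginica flowers.
# Sheets (approx.): =FILTER(A2:E151, E2:E151="virginica")
virginica <- iris %>% filter(Species == "virginica")

# Sort rows: longest sepals first.
# Sheets (approx.): =SORT(A2:E151, 1, FALSE)
sorted <- iris %>% arrange(desc(Sepal.Length))

# Add a column: sepal length in millimeters.
# Sheets (approx.): enter =A2*10 in F2 and fill down
with_mm <- iris %>% mutate(SLMm = Sepal.Length * 10)

# Grouped summary: median and max sepal length per species.
# Sheets (approx.): a pivot table, or one formula per species such as
#   =MEDIAN(FILTER(A2:A151, E2:E151="virginica"))
by_species <- iris %>%
  group_by(Species) %>%
  summarize(medianSL = median(Sepal.Length), maxSL = max(Sepal.Length))

print(by_species)
```

A point worth drawing out in the comparison: the R pipeline records every step as text that can be rerun on new data, while the spreadsheet version spreads the same logic across cell formulas and menu actions.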

Review the Tidyverse cheat sheet for a summary of data manipulation commands. This cheat sheet will also be useful for the practice problems and other assignments and projects.

Tidyverse2BCheat2BSheet.pdf

R For Data Science Cheat Sheet: Tidyverse for Beginners

Learn More R for Data Science Interactively at www.datacamp.com

Tidyverse


The tidyverse is a powerful collection of R packages for transforming and visualizing data. All packages of the tidyverse share an underlying philosophy and common APIs.

The core packages are:

• ggplot2, which implements the grammar of graphics. You can use it to visualize your data.

• dplyr is a grammar of data manipulation. You can use it to solve the most common data manipulation challenges.

• tidyr helps you to create tidy data, that is, data where each variable is a column, each observation is a row, and each value is a cell.

• readr is a fast and friendly way to read rectangular data.

• purrr enhances R’s functional programming (FP) toolkit by providing a complete and consistent set of tools for working with functions and vectors.

• tibble is a modern reimagining of the data frame.

• stringr provides a cohesive set of functions designed to make working with strings as easy as possible.

• forcats provides a suite of useful tools that solve common problems with factors.

You can install the complete tidyverse with:

> install.packages("tidyverse")

Then, load the core tidyverse and make it available in your current R session by running:

> library(tidyverse)

Note: there are many other tidyverse packages with more specialised usage. They are not loaded automatically with library(tidyverse), so you’ll need to load each one with its own call to library().

Useful Functions

> tidyverse_conflicts()   # Conflicts between tidyverse and other packages
> tidyverse_deps()        # List all tidyverse dependencies
> tidyverse_logo()        # Get tidyverse logo, using ASCII or unicode characters
> tidyverse_packages()    # List all tidyverse packages
> tidyverse_update()      # Update tidyverse packages

Loading in the data

> library(datasets)    # Load the datasets package
> library(gapminder)   # Load the gapminder package
> attach(iris)         # Attach iris data to the R search path

dplyr

Filter

filter() allows you to select a subset of rows in a data frame.

> # Select iris data of species "virginica"
> iris %>% filter(Species == "virginica")
> # Select iris data of species "virginica" and sepal length greater than 6
> iris %>% filter(Species == "virginica", Sepal.Length > 6)

Arrange

arrange() sorts the observations in a dataset in ascending or descending order based on one of its variables.

> # Sort in ascending order of sepal length
> iris %>% arrange(Sepal.Length)
> # Sort in descending order of sepal length
> iris %>% arrange(desc(Sepal.Length))

Combine multiple dplyr verbs in a row with the pipe operator %>%:

> # Filter for species "virginica", then arrange in descending order of sepal length
> iris %>%
    filter(Species == "virginica") %>%
    arrange(desc(Sepal.Length))

Mutate

mutate() allows you to update or create new columns of a data frame.

> # Change Sepal.Length to be in millimeters
> iris %>% mutate(Sepal.Length = Sepal.Length * 10)
> # Create a new column called SLMm
> iris %>% mutate(SLMm = Sepal.Length * 10)

Combine the verbs filter(), arrange(), and mutate():

> iris %>%
    filter(Species == "virginica") %>%
    mutate(SLMm = Sepal.Length * 10) %>%
    arrange(desc(SLMm))

Summarize

summarize() allows you to turn many observations into a single data point.

> # Summarize to find the median sepal length
> iris %>% summarize(medianSL = median(Sepal.Length))
> # Filter for virginica, then summarize the median sepal length
> iris %>%
    filter(Species == "virginica") %>%
    summarize(medianSL = median(Sepal.Length))

You can also summarize multiple variables at once:

> iris %>%
    filter(Species == "virginica") %>%
    summarize(medianSL = median(Sepal.Length),
              maxSL = max(Sepal.Length))

group_by() allows you to summarize within groups instead of summarizing the entire dataset:

> # Find median and max sepal length of each species
> iris %>%
    group_by(Species) %>%
    summarize(medianSL = median(Sepal.Length),
              maxSL = max(Sepal.Length))
> # Find median and max petal length of each species with sepal length > 6
> iris %>%
    filter(Sepal.Length > 6) %>%
    group_by(Species) %>%
    summarize(medianPL = median(Petal.Length),
              maxPL = max(Petal.Length))

ggplot2

Scatter plot

Scatter plots allow you to compare two variables within your data. To do this with ggplot2, you use geom_point().

> iris_small <- iris %>% filter(Sepal.Length > 5)
> # Compare petal width and length
> ggplot(iris_small, aes(x = Petal.Length, y = Petal.Width)) +
    geom_point()

Additional Aesthetics

• Color

> ggplot(iris_small, aes(x = Petal.Length, y = Petal.Width, color = Species)) +
    geom_point()

• Size

> ggplot(iris_small, aes(x = Petal.Length, y = Petal.Width, color = Species, size = Sepal.Length)) +
    geom_point()

Faceting

> ggplot(iris_small, aes(x = Petal.Length, y = Petal.Width)) +
    geom_point() +
    facet_wrap(~Species)

Line Plots

> by_year <- gapminder %>%
    group_by(year) %>%
    summarize(medianGdpPerCap = median(gdpPercap))
> ggplot(by_year, aes(x = year, y = medianGdpPerCap)) +
    geom_line() +
    expand_limits(y = 0)

Bar Plots

> by_species <- iris %>%
    filter(Sepal.Length > 6) %>%
    group_by(Species) %>%
    summarize(medianPL = median(Petal.Length))
> ggplot(by_species, aes(x = Species, y = medianPL)) +
    geom_col()

Histograms

> ggplot(iris_small, aes(x = Petal.Length)) +
    geom_histogram()

Box Plots

> ggplot(iris_small, aes(x = Species, y = Sepal.Width)) +
    geom_boxplot()
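A note on the histogram example: called without arguments, geom_histogram() falls back to 30 bins and prints a message suggesting that you pick a value yourself. The sketch below shows one way to set the bin width explicitly and to inspect the resulting counts. The width of 0.5 and the object names petal_hist and bins are illustrative assumptions, not part of the cheat sheet.

```r
library(dplyr)
library(ggplot2)

# Same subset the plotting examples use
iris_small <- iris %>% filter(Sepal.Length > 5)

# binwidth = 0.5 is an illustrative choice, not a value from the cheat sheet
petal_hist <- ggplot(iris_small, aes(x = Petal.Length)) +
  geom_histogram(binwidth = 0.5)

# ggplot_build() computes the bins, so the counts can be checked as a table
bins <- ggplot_build(petal_hist)$data[[1]]
print(bins[, c("xmin", "xmax", "count")])
```

Printing petal_hist (or typing its name at the console) draws the plot; the table is only a convenient way to confirm that every row of iris_small landed in some bin.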
