The task is to explore the US census population estimates by county for 2022 from the package usmap. The data frame (countypop) has 3222 rows and 4 variables:
fips is the 5-digit FIPS code corresponding to the county;
abbr is the 2-letter state abbreviation;
county is the full county name;
pop_2022 is the 2022 population estimate (in number of people) for the corresponding county.
Each row of the data frame represents a different county or a county equivalent. For simplicity, "county" also covers county equivalents, and "state" also covers the District of Columbia. Answer the following questions.
You will need to modify the code chunks so that the code works within each chunk (usually this means modifying anything in ALL CAPS). You will also need to modify the code outside the code chunk. When you get the desired result for each step, change Eval=F to Eval=T and knit the document to HTML to make sure it works.
After you complete the lab, submit the HTML file of your completed work to Canvas before the deadline.
Use the length and unique functions.
FUNCTION1(FUNCTION2(VARIABLE))
Use the length and unique functions.
FUNCTION1(FUNCTION2(VARIABLE))
Use the length and unique functions.
FUNCTION1(FUNCTION2(VARIABLE))
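As a minimal sketch of how these two functions nest, the example below counts distinct values in a small made-up data frame. The data frame toy and its values are hypothetical and are not taken from countypop; only the column names abbr and county mirror the lab data.

```r
# Toy data (hypothetical values, not taken from countypop)
toy <- data.frame(
  abbr   = c("NC", "NC", "VA", "VA", "SC"),
  county = c("Wake County", "Orange County", "Orange County",
             "Fairfax County", "York County")
)

# unique() drops repeated values; length() counts what is left
n_states <- length(unique(toy$abbr))    # distinct state abbreviations
n_names  <- length(unique(toy$county))  # distinct county names

print(n_states)
print(n_names)
```

Note that "Orange County" appears in two states, so the number of distinct county names can be smaller than the number of rows.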
count the number of times each county name appears, arrange in descending order and show the first 10 observations.
DATANAME %>%
  count(COUNTY_VARIABLE) %>%
  arrange(ORDER_FUNCTION(n)) %>%
  head(NUMBER_OF_OBS_TO_SHOW)
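The count, arrange, head pattern can be sketched on a toy data frame as follows. This assumes the dplyr package is installed; toy and its values are made up for illustration and are not from countypop.

```r
library(dplyr)

# Toy data (hypothetical values, not taken from countypop)
toy <- data.frame(
  abbr   = c("NC", "VA", "FL", "NC", "SC"),
  county = c("Orange County", "Orange County", "Orange County",
             "Wake County", "York County")
)

name_counts <- toy %>%
  count(county) %>%      # one row per county name, with its frequency in column n
  arrange(desc(n)) %>%   # most frequent names first
  head(2)                # keep only the first two rows

print(name_counts)
```

count() names its frequency column n by default, which is why the arrange step in the lab template refers to n.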
count the number of observations in each state, arrange the data in ascending order and show the first observation.
countypop %>%
  count(STATE_VARIABLE) %>%
  arrange(n) %>%
  head(NUMBER_OF_OBS_TO_SHOW)
arrange the data with pop_2022 in descending order. The first observation contains the information.
pop_2022=countypop$pop_2022
arrange(countypop,ORDER_FUNCTION(POP_VARIABLE))[1,]
countypop %>%
group_by(STATE_VARIABLE) %>%
summarise(total_pop=SUM_FUNCTION(POP_VARIABLE))
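A grouped summary like the template above might look as follows on a toy data frame. This is only a sketch, assuming dplyr is installed; toy, its pop column and all values are hypothetical.

```r
library(dplyr)

# Toy data (hypothetical values, not taken from countypop)
toy <- data.frame(
  abbr = c("NC", "NC", "VA"),
  pop  = c(100, 200, 50)
)

state_totals <- toy %>%
  group_by(abbr) %>%                # one group per state abbreviation
  summarise(total_pop = sum(pop))   # collapse each group to a single total

print(state_totals)
```

summarise() returns one row per group, so the result has as many rows as there are distinct values of abbr.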
filter the data to keep observations from ‘NC’, summarise the data to get the average population.
countypop %>%
  filter(STATE_VARIABLE==NORTH_CAROLINA) %>%
  summarise(AVERAGE_FUNCTION(VARIABLE))
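The filter-then-summarise step can be sketched on toy data like this. It assumes dplyr is installed; toy, the column name avg_pop and all values are made up for illustration.

```r
library(dplyr)

# Toy data (hypothetical values, not taken from countypop)
toy <- data.frame(
  abbr = c("NC", "NC", "VA"),
  pop  = c(100, 300, 50)
)

nc_avg <- toy %>%
  filter(abbr == "NC") %>%         # keep only rows whose abbreviation is the string "NC"
  summarise(avg_pop = mean(pop))   # average over the remaining rows

print(nc_avg)
```

The state abbreviation is a character value, so it has to be quoted inside filter(); an unquoted NC would be looked up as a variable name.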
countypop %>%
group_by(STATE_VARIABLE) %>%
summarise(county=COUNTY_VARIABLE[which.max(POP_VARIABLE)],MAX_FUNCTION(POP_VARIABLE))
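To illustrate how which.max picks out the row with the largest value inside each group, here is a small sketch on toy data. It assumes dplyr is installed; toy, the column name max_pop and all values are hypothetical.

```r
library(dplyr)

# Toy data (hypothetical values, not taken from countypop)
toy <- data.frame(
  abbr   = c("NC", "NC", "VA", "VA"),
  county = c("Wake County", "Orange County", "Fairfax County", "Orange County"),
  pop    = c(300, 100, 500, 50)
)

top_county <- toy %>%
  group_by(abbr) %>%
  summarise(
    county  = county[which.max(pop)],  # which.max() gives the position of the largest pop in the group
    max_pop = max(pop)                 # the largest pop value itself
  )

print(top_county)
```

which.max() returns the index of the first maximum, so if two counties in a group tie, only the first one is reported.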