These are exercises with plotly and data from the State Energy Data System (SEDS).
The original data source is available on a state-by-state basis, or for the US as a whole, in a “wide” format.
The code below downloads the data, imports it, and tidies it into a long format. It also adds unit codes, which, at the time of writing, are of my own invention.
The data wrangling code is hidden in this file so that the plots can be more easily seen, as they are the focus of the coursework. The code can be seen in the Rmd file in this GitHub repository.
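Since the wrangling code is hidden, here is a minimal sketch of what a wide-to-long reshape of this kind might look like in base R. The toy data frame, its values, and the placeholder code "AAAAA" are invented for illustration; the real SEDS columns and the actual wrangling code in the Rmd file may differ.

```r
# Toy "wide" table: one row per State/MSN pair, one column per year.
# All names and numbers here are made up for illustration.
wide_df <- data.frame(
  State = c("AK", "AL"),
  MSN = c("AAAAA", "AAAAA"),
  `1999` = c(1, 2),
  `2000` = c(3, 4),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Every column that is not an identifier is treated as a year column.
year_cols <- setdiff(names(wide_df), c("State", "MSN"))

# Stack one block of rows per year to get the "long" layout:
# one row per State/MSN/Year combination, with a single value column.
long_df <- do.call(rbind, lapply(year_cols, function(yr) {
  data.frame(
    State = wide_df$State,
    MSN = wide_df$MSN,
    Year = yr,
    value = wide_df[[yr]],
    stringsAsFactors = FALSE
  )
}))

long_df
```

The long layout is what lets the plotting code further down refer to `~Year` and `~value` directly. A package such as tidyr would be a common alternative for this step.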
# Subset data by MSN so that I can plot for MSN
# Note that creating the matrix with a list data type is important, else you can't add dataframes into cells
msn_subsets_mtrx <- matrix(list(), nrow = 2, ncol = nrow(msn_codes))
col_counter <- 1
for (msn_code in msn_codes$MSN) {
  msn_subsets_mtrx[[1, col_counter]] <- msn_code
  msn_subsets_mtrx[[2, col_counter]] <- subset(all_states_long_df, MSN == msn_code)
  col_counter <- col_counter + 1
}
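As a possible alternative to the list matrix above, base R's `split()` produces a named list with one data frame per MSN code. This is only a sketch: `toy_long_df` and the codes "AAAAA" and "BBBBB" are hypothetical stand-ins for `all_states_long_df` and real MSN codes.

```r
# Toy long-format data; the codes and values are invented for illustration.
toy_long_df <- data.frame(
  State = c("AK", "AL", "AK", "AL"),
  MSN = c("AAAAA", "AAAAA", "BBBBB", "BBBBB"),
  Year = c("2000", "2000", "2000", "2000"),
  value = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# One list element per distinct MSN code, named by that code.
msn_subsets <- split(toy_long_df, toy_long_df$MSN)

names(msn_subsets)
msn_subsets[["BBBBB"]]
```

A subset can then be looked up by code rather than by a column counter. One difference from the matrix approach: `split()` only creates entries for codes that actually occur in the data, so codes listed in `msn_codes` with no rows would be missing rather than present as empty data frames.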
There are two types of plots here: line plots and choropleth maps.
The line plots show the consumption for all states for a given consumable, identified by an MSN code, across the sampled time period.
# This syntax for using plotly in a loop comes from here:
# https://github.com/ropensci/plotly/issues/273
line_plot_gatherer <- htmltools::tagList()
col_counter <- 1
for (msn_code in msn_codes$MSN) {
  plot_df <- msn_subsets_mtrx[[2, col_counter]]
  if (nrow(plot_df) > 0) {
    line_plot_gatherer[[col_counter]] <- plot_ly(plot_df, x = ~Year, y = ~value, color = ~State, type = "scatter", mode = "lines") %>%
      layout(title = paste("All US States -", msn_codes[msn_codes$MSN == msn_code, 2], "-", msn_codes[msn_codes$MSN == msn_code, 4]))
  }
  col_counter <- col_counter + 1
  # Stop before too many are created, just to save time and space
  if (col_counter > 10) {
    break
  }
}
line_plot_gatherer
The map plots should show the consumption for all states for a given consumable, identified by an MSN code, for a given year.
This code illustrates a possible bug when plotting multiple maps, although it could also be user error. It may be related to: https://github.com/ropensci/plotly/issues/273
There are two main issues:
I am keeping this code here, and publishing it, as an example I can refer to later if needed. The coursework deadline is today.
# Make state borders red
borders <- list(color = toRGB("red"))
# Set up some mapping options
map_options <- list(
  scope = 'usa',
  projection = list(type = 'albers usa'),
  showlakes = TRUE,
  lakecolor = toRGB('white')
)
map_plot_gatherer <- htmltools::tagList()
col_counter <- 1
for (msn_code in msn_codes$MSN) {
  plot_df <- msn_subsets_mtrx[[2, col_counter]]
  # Select a specific year
  year <- "2000"
  plot_df <- subset(plot_df, Year == year)
  if (nrow(plot_df) > 0) {
    map_plot_gatherer[[col_counter]] <- plot_ly(z = ~plot_df$value, text = ~plot_df$value, locations = ~plot_df$State, type = 'choropleth', locationmode = 'USA-states', color = plot_df$Value, colors = 'Blues', marker = list(line = borders)) %>%
      layout(title = paste("Debug sum value is -", sum(subset(plot_df, MSN == msn_code)$value)), geo = map_options)
    # The above is a debug title to see whether I really am using different data sets
    # This is what the title should be:
    # layout(title = paste(msn_codes[msn_codes$MSN == msn_code, 2], "-", year, "-", msn_codes[msn_codes$MSN == msn_code, 4]), geo = map_options)
  }
  col_counter <- col_counter + 1
  # If more than two maps are created, then there is no data, just blank maps. See:
  # https://github.com/ropensci/plotly/issues/273
  # Stopping when col_counter > 2 works, to show two maps. Trying to show three did not work.
  if (col_counter > 3) {
    break
  }
}
map_plot_gatherer
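One thing that may contribute to the map problem, offered here as a hypothesis rather than a confirmed diagnosis: the map loop writes its formulas as `~plot_df$value` without passing a data frame to `plot_ly()`. An R formula keeps a reference to the environment it was created in, not a copy of the data, so if it is only evaluated after the loop has finished it will see whatever `plot_df` holds at that point. The base R sketch below shows that mechanism in isolation; `toy_df` is a hypothetical stand-in for `plot_df`, and no plotly code is involved.

```r
# Store one formula per iteration, in the same style as ~plot_df$value.
formulas <- list()
for (i in 1:3) {
  toy_df <- data.frame(value = i * 10)  # hypothetical stand-in for plot_df
  formulas[[i]] <- ~toy_df$value
}

# Evaluate the stored formulas only after the loop has finished.
# Each formula looks toy_df up in its environment at this point,
# so all three see the data frame from the final iteration.
late_values <- vapply(
  formulas,
  function(f) eval(f[[2]], environment(f)),
  numeric(1)
)

late_values
```

All three results come from the last `toy_df`, even though a different data frame existed when each formula was created. The line plot loop avoids this pattern by passing `plot_df` as the first argument to `plot_ly()` and using `~value`, which stores the data with each plot. Writing the map call the same way, for example `plot_ly(plot_df, z = ~value, locations = ~State, ...)`, might be worth trying, though I have not verified that it resolves the blank maps.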