ggplot2 - ggplot in R with fortify takes too long to process small geospatial data


I am trying to use ggplot to draw a map of Canada and colour-code each region based on total sales. The geospatial file is from GADM and contains 12 provinces (level 1). When I fortify the data, the resulting data.frame has over 4 million rows. When I then try to draw the map, ggplot seems to hang. I've left it for 30 minutes and had to give up.

Is the problem the size of the result of fortify? I don't know how to reduce the size. I've tried playing with the 'region' argument in fortify, but that causes fortify itself to appear to hang.

I have included the code and the URL to download the data I am working with.

    require(dplyr)

    # loaded from: https://raw.githubusercontent.com/technology-hatchery/rcode/master/data/sample%20-%20superstore%20sales%20(excel).csv
    orders <- read.csv(file='data/orders.csv', sep=',', header=TRUE, na.strings = '')
    orders$order.date <- as.Date(orders$order.date, '%m/%d/%y')
    orders$order.priority <- as.factor(orders$order.priority)
    orders$customer.name <- as.character(orders$customer.name)
    orders$ship.date <- as.Date(orders$ship.date, '%m/%d/%y')
    orders$order.total <- orders$unit.price * orders$order.quantity
    orders <- tbl_df(orders)

    require(raster)
    require(ggplot2)
    require(RColorBrewer)
    require(rgdal)
    require(rgeos)

    # map from GADM: http://biogeo.ucdavis.edu/data/gadm2.7/rds/can_adm1.rds
    canada <- readRDS('../../geo/gadm/canada/can_adm1.rds')
    canada <- spTransform(canada, CRS("+proj=longlat +datum=WGS84"))

    # add data to the spatial polygon data.frame
    canada.df <- fortify(canada)
    summary(canada.df)

    nrow(canada.df)
    # [1] 4005898

    # build the region list to add to the spatial df
    provinces <- canada@data %>%
      dplyr::select(OBJECTID, NAME_1) %>%
      dplyr::rename(id = OBJECTID, province = NAME_1)
    head(provinces)

    # add total sales to the spatial df
    provinceorders <- orders %>%
      mutate(province = as.character(province)) %>%
      left_join(., provinces, by='province') %>%
      group_by(id, province) %>%
      dplyr::summarise(total = sum(order.total)) %>%
      dplyr::select(id, total)
    head(provinceorders)

    canada.df <- merge(canada.df, provinceorders, by='id', all.x=TRUE)
    canada.df <- arrange(canada.df, order, group)
    head(canada.df)

    ggplot() +
      geom_polygon(data=canada.df, aes(x=long, y=lat, group=group, fill=total), color='white') +
      scale_fill_gradient(high='red', low= 'blue')
      #geom_text(aes(label=province, x=long, y=lat))

Try the shapefile from NOAA instead. It has the provinces but doesn't have super-precise coastline polygons (which aren't needed here):

    library(rgdal)
    library(ggplot2)
    library(ggthemes)

    # download and unzip the shapefile once, then read its first layer
    url <- "http://www.nws.noaa.gov/geodata/catalog/national/data/province.zip"
    fil <- basename(url)
    if (!file.exists(fil)) download.file(url, fil)
    fils <- grep("shp", unzip(fil), ignore.case=TRUE, value=TRUE)
    ca <- readOGR(fils, ogrListLayers(fils)[1])

    ca_map <- fortify(ca, region="NAME")

    gg <- ggplot()
    gg <- gg + geom_map(data=ca_map, map=ca_map,
                        aes(x=long, y=lat, map_id=id),
                        color="black", fill="white", size=0.15)
    gg <- gg + coord_map("lambert", 44, 85)
    gg <- gg + theme_map()
    gg


Timing the call with system.time(ca_map <- fortify(ca, region="NAME")) shows:

    ##    user  system elapsed
    ##   0.517   0.005   0.523

pretty consistently for me.
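If you would rather keep the GADM boundaries, one option (my assumption, not something tested on the file above) is to thin out the polygon vertices before calling fortify, since fortify emits one row per vertex. The sketch below uses rgeos::gSimplify on a synthetic ring standing in for a detailed coastline; the ring, the tolerance of 0.01 and the helper count_vertices are all made up for illustration.

```r
library(sp)
library(rgeos)

# Hypothetical stand-in for a detailed coastline: a clockwise ring
# with 100,000 vertices on the unit circle.
theta <- seq(0, 2 * pi, length.out = 100000)
ring <- cbind(sin(theta), cos(theta))
ring[nrow(ring), ] <- ring[1, ]  # close the ring exactly

detailed <- SpatialPolygons(
  list(Polygons(list(Polygon(ring, hole = FALSE)), ID = "1"))
)

# Hypothetical helper: total number of vertices across all rings.
# fortify() would produce one data.frame row for each of these.
count_vertices <- function(sp_obj) {
  sum(sapply(sp_obj@polygons, function(p) {
    sum(sapply(p@Polygons, function(r) nrow(r@coords)))
  }))
}

# Douglas-Peucker simplification; tol is in the units of the
# coordinates (degrees for a longlat map). 0.01 is an arbitrary pick.
simplified <- gSimplify(detailed, tol = 0.01, topologyPreserve = TRUE)

n_detailed <- count_vertices(detailed)
n_simplified <- count_vertices(simplified)
cat("vertices before:", n_detailed, " after:", n_simplified, "\n")
```

On the real data this would be something like gSimplify(canada, tol = 0.01, topologyPreserve = TRUE). Note that gSimplify returns geometry only, so the attribute table (canada@data) has to be re-attached or joined afterwards, and the right tolerance is something you would need to tune by eye.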

