Showing posts with label latticeExtra. Show all posts

Transforming subsets of data in R with by, ddply and data.table

Transforming data sets with R is usually the starting point of my data analysis work. Here is a scenario which comes up from time to time: transform subsets of a data frame, based on context given in one or a combination of columns.

As an example I use a data set which shows sales figures by product for a number of years:

df <- data.frame(Product=gl(3,10,labels=c("A","B", "C")),
                 Year=factor(rep(2002:2011,3)),
                 Sales=1:30)
head(df)
## Product Year Sales
## 1 A 2002 1
## 2 A 2003 2
## 3 A 2004 3
## 4 A 2005 4
## 5 A 2006 5
## 6 A 2007 6

I am interested in absolute and relative sales developments by product over time. Hence, I would like to add a column to my data frame that shows the sales figures divided by the total sum of sales in each year, so I can create a chart which looks like this:

There are lots of ways of doing this transformation in R. Here are three approaches, using by, ddply and data.table.
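The full post is not reproduced here, so the following is only a minimal sketch of how the three approaches might look, not the author's original code. It assumes the df defined above; the column name Share and the helper names r1, r2 and dt are made up for illustration. The plyr and data.table parts run only if those packages are installed.

```r
# Same example data as above
df <- data.frame(Product = gl(3, 10, labels = c("A", "B", "C")),
                 Year = factor(rep(2002:2011, 3)),
                 Sales = 1:30)

# Approach 1: base R by(). Split by Year, add the share, recombine.
r1 <- do.call(rbind, by(df, df$Year, function(d) {
  d$Share <- d$Sales / sum(d$Sales)  # share of that year's total sales
  d
}))
r1 <- r1[order(r1$Product, r1$Year), ]  # restore the original row order
rownames(r1) <- NULL

# Approach 2: plyr::ddply(), applying transform() to each Year subset
if (requireNamespace("plyr", quietly = TRUE)) {
  r2 <- plyr::ddply(df, "Year", transform, Share = Sales / sum(Sales))
}

# Approach 3: data.table, adding the column by reference within each Year
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)
  dt <- as.data.table(df)
  dt[, Share := Sales / sum(Sales), by = Year]
}

head(r1)
```

In each variant the shares within a year add up to 1, which is what a chart of relative sales by product needs.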

Waterfall charts in style of The Economist with R

Waterfall charts can be quite helpful for illustrating the various moving parts in financial data, in particular when there are both positive and negative values, as in a profit and loss statement (P&L). However, they can be a bit of a pain to produce in Excel. Not so in R, thanks to the waterfall package by James Howard. In combination with the latticeExtra package, it is nearly a one-liner to produce a good-looking waterfall chart that mimics the look of The Economist:

Example of a waterfall chart in R
library(latticeExtra)
library(waterfall)
data(rasiel) # Example data of the waterfall package
rasiel
# label value subtotal
# 1 Net Sales 150 EBIT
# 2 Expenses -170 EBIT
# 3 Interest 18 Net Income
# 4 Gains 10 Net Income
# 5 Taxes -2 Net Income

asTheEconomist(
  waterfallchart(value ~ label, data=rasiel,
                 groups=subtotal, main="P&L")
)
Of course, you can also create a waterfall chart with ggplot2; the Learning R blog has a post on this topic.
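For orientation, here is a rough sketch of one way a ggplot2 version could be built. It is not the code from the Learning R post. It assumes the values shown in the rasiel printout above, typed in by hand, and the names wf, id, from and to are hypothetical. Each bar starts where the previous one ended, so the floating rectangles come from a running total. The plot is drawn only if ggplot2 is installed.

```r
# Values copied by hand from the rasiel printout above
wf <- data.frame(
  label = c("Net Sales", "Expenses", "Interest", "Gains", "Taxes"),
  value = c(150, -170, 18, 10, -2),
  stringsAsFactors = FALSE
)

wf$id   <- seq_len(nrow(wf))         # bar position on the x axis
wf$to   <- cumsum(wf$value)          # running total after each item
wf$from <- c(0, head(wf$to, -1))     # running total before each item

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(wf) +
    geom_rect(aes(xmin = id - 0.4, xmax = id + 0.4,
                  ymin = from, ymax = to,
                  fill = value >= 0)) +          # colour gains and losses differently
    scale_x_continuous(breaks = wf$id, labels = wf$label) +
    labs(title = "P&L", x = NULL, y = "value", fill = "Positive")
  print(p)
}
```

Unlike waterfallchart, this sketch draws no subtotal bars; those would need extra rows for EBIT and Net Income.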