Saturday, November 30, 2013

What is a data frame in R

Data frames in R are simply lists of vectors (with some extra restrictions and attributes). Any operation that works on a list of vectors can be called "as-is" on a data frame. For example:
a = list(v1 = c(1, 2, 3), v2 = c(4, 5, 6))
b = data.frame(v1 = c(1, 2, 3), v2 = c(4, 5, 6))

Then a[[1]], a[1], a["v1"], and a$v1 are all legal expressions on a, and the same expressions work on b. We can even check that b is a list:
> is.list(b)
[1] TRUE
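
To make the "list of vectors" point concrete, here is a small sketch that applies ordinary list functions to the same data frame. The variable names (n_cols, col_means, and so on) are just illustrative. One nuance worth noting: single-bracket indexing keeps the container type, so it returns a list for a list and a data frame for a data frame.

```r
a <- list(v1 = c(1, 2, 3), v2 = c(4, 5, 6))
b <- data.frame(v1 = c(1, 2, 3), v2 = c(4, 5, 6))

# length() counts list elements, which for a data frame are its columns
n_cols <- length(b)

# sapply() iterates over the columns exactly as it would over a plain list
col_means <- sapply(b, mean)

# [[ ]] extracts one element (a column) as a plain vector
first_col <- b[[1]]

# [ ] keeps the container: a list stays a list, a data frame stays a data frame
sub_a <- a[1]
sub_b <- b[1]
```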

and that both contain the same objects:
> objects(a)
[1] "v1" "v2"
> objects(b)
[1] "v1" "v2"

What attributes does a data frame have in addition? Row names and a class of "data.frame":
> attributes(a)
$names
[1] "v1" "v2"

> attributes(b)
$names
[1] "v1" "v2"

$row.names
[1] 1 2 3

$class
[1] "data.frame"
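
Going the other way is also instructive. As a rough sketch, unclass() strips the class attribute and leaves the underlying list behind (the name raw is just a placeholder):

```r
b <- data.frame(v1 = c(1, 2, 3), v2 = c(4, 5, 6))

# unclass() removes only the "class" attribute; names and row.names remain
raw <- unclass(b)

raw_is_df   <- is.data.frame(raw)      # no longer dispatches as a data frame
raw_is_list <- is.list(raw)            # still a list underneath
raw_attrs   <- names(attributes(raw))  # attribute names left on the object
```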

So, can we give these attributes to a list of vectors and fully convert it into a data frame? Yes!

> is.data.frame(a)
[1] FALSE
> class(a) <- "data.frame"
> row.names(a) <- c(1,2,3)
> is.data.frame(a)
[1] TRUE
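
Setting the attributes by hand is a good way to see what a data frame is made of. In everyday code, as.data.frame() is the usual way to do the same conversion. A minimal sketch, starting from a fresh copy of the list (d is an illustrative name):

```r
a <- list(v1 = c(1, 2, 3), v2 = c(4, 5, 6))

# as.data.frame() builds the row names and class for us
d <- as.data.frame(a)

d_is_df <- is.data.frame(d)
d_dim   <- dim(d)     # rows, then columns
d_names <- names(d)   # column names come from the list names
```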

Just one extra requirement (obviously): all vectors in the list must have the same length.
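
A quick way to see this requirement enforced is to hand data.frame() vectors of lengths 3 and 2. The sketch below catches the resulting error with tryCatch() so the script keeps running; res and failed are illustrative names. (data.frame() does recycle a shorter vector when its length divides the longer one evenly, so the lengths here are deliberately chosen not to.)

```r
# Lengths 3 and 2 cannot be reconciled, so data.frame() signals an error
res <- tryCatch(
  data.frame(v1 = c(1, 2, 3), v2 = c(4, 5)),
  error = function(e) e
)

failed <- inherits(res, "error")
if (failed) message("data.frame() refused: ", conditionMessage(res))
```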
