Sunday, November 24, 2013

NA and NULL values in R

As data becomes more and more important, R is often used together with other tools, such as MySQL, for data mining. SQL represents a missing value as NULL, but once such values are brought into R they can cause problems if we are not careful, because R distinguishes between NULL and NA. Here are some examples to illustrate how R deals with NULL values and NA values.

# NA has length 1, NULL has length 0
# inside c(), NA is kept as an element while NULL is dropped
a <- NA; length(a); class(a)
b <- NULL; length(b); class(b)
a <- c(1, 2, NA); length(a); class(a)
b <- c("1", "2", NULL); length(b); class(b)

# both NA and NULL can be coerced to a given type
# coercing NULL yields a zero-length vector of that type, e.g. character(0)
as.character(NA); class(as.character(NA))
as.character(NULL); class(as.character(NULL))
class(as.numeric(NULL)); class(as.character(NULL))

# when pasted together with other strings,
# NA is first converted to the string "NA", while NULL is
# dropped by c() before paste() ever sees it
paste(c("a", NA, "b", NULL, "c"), collapse = "+")
paste(as.character(c(1, NA, 2, NULL, 3)), collapse = "+")

# NULL is different from an empty string "": the empty string
# is kept as an element, NULL is not, see example
paste(c("a", "", "b", NULL, "c"), collapse = "+")
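If the literal "NA" is not wanted in the pasted result, one option is to drop or blank out the missing values first. The following is only a sketch; the variable names are made up for illustration, and which of the two treatments is appropriate depends on the data.

```r
x <- c("a", NA, "b", NULL, "c")   # NULL vanishes inside c(), so length(x) is 4

# default behaviour: NA becomes the string "NA"
with_na <- paste(x, collapse = "+")                        # "a+NA+b+c"

# option 1: drop the missing values before pasting
dropped <- paste(x[!is.na(x)], collapse = "+")             # "a+b+c"

# option 2: keep the position but paste an empty string instead
blanked <- paste(ifelse(is.na(x), "", x), collapse = "+")  # "a++b+c"

with_na; dropped; blanked
```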

# NULL can be confirmed using is.null(), but when it is
# given a class, it is not NULL anymore, although the
# length is still zero
a <- NULL; class(a); length(a); is.null(a)
b <- as.numeric(NULL); class(b); length(b); is.null(b)
class(a) <- "numeric"; is.null(a)
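
Because NULL has length zero, it silently disappears when a list is flattened, which can shift values out of position. A minimal sketch of one way to guard against this is shown below; null_to_na() and is_empty() are hypothetical helpers written for this post, not functions from base R or any package, and the ages list is made-up data.

```r
# Hypothetical helper: replace NULL (or any zero-length) elements of a list
# with NA so that every element contributes exactly one value.
null_to_na <- function(x) {
  lapply(x, function(el) if (length(el) == 0) NA else el)
}

# made-up records in which the second age is missing (NULL)
ages <- list(31, NULL, 45)

length(unlist(ages))              # 2: the NULL element is silently dropped
length(unlist(null_to_na(ages)))  # 3: the missing value survives as NA

# length(x) == 0 catches both NULL and typed empties such as numeric(0),
# whereas is.null() only catches NULL itself
is_empty <- function(x) length(x) == 0
is_empty(NULL); is_empty(as.numeric(NULL)); is_empty(NA)
```

Testing the length rather than calling is.null() is a design choice here: as the last example above shows, a NULL that has been given a type is no longer NULL, but its length is still zero.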
