January 17, 2011 / Rohit

A Few Notes on R

I was introduced to the R programming language as part of our data mining course. R was designed to be used by statisticians and it shows. It’s an interpreted language with an interactive shell, much like Python’s.

To install R on Ubuntu you would do:

sudo apt-get install r-base

and start the R shell by typing “R” (capital) at the command line.

Things which I like about the language:

  1. You can get help on any function by prefixing its name with a “?”. For example:
    ?hist
    

    will bring up the help page for the “hist” function, which plots histograms.

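  2. Generating random numbers is easy. To generate 100 random numbers between 1 and 1000 and store them in “x”, you would do the following (sample() draws without replacement by default, so the 100 values are all distinct):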
    x <- sample(1:1000, 100)
    x
      [1] 770 130 123 655 122  77 928 159 680 543 716 939 355 581 394 551 626 408
     [19] 637 650 925 606 299 107 820 584 800 880 918 887 327 480 174 414 720 700
     [37] 316 628 422 287 603 339 445  18 967  44 425 448 209  82 696  53 613 946
     [55] 527 686 192 515  60 915 773  55 644 404 767 745 447   6 860 485 549 502
     [73] 499 537 450 897  70 917 160 709 592 297 125  88 239 998 933 761 532 703
     [91] 574 120 139 207 470 126 597 863 617 458
    
  3. You can plot pretty graphs without bothering about the windowing system or graphics. For example, the following code draws a bar plot of 100 dice rolls:
    dice <- sample(1:6, 100, replace=TRUE)
    # replace=TRUE makes each roll independent; it is also required here because we draw more values (100) than the range 1:6 contains
    barplot(table(dice), col="violet")


    Figure 1: Bar Plot in R

  4. R is great at vector manipulation. For example, squaring a vector’s contents is done by:
    # create the vector 1, 2, ..., 10
    x <- 1:10
    # square its contents and store it back
    x <- x * x
    x
     [1]   1   4   9  16  25  36  49  64  81 100
  5. It’s very easy to serialize and deserialize data. For example, to save the squared vector from the previous example, you would do:
    x <- 1:10
    x <- x^2
    # save x in the file mydata.RData (the .RData extension is a convention, not a requirement)
    save(x, file="mydata.RData")
    # remove x from the current set of variables
    rm(x)
    # load it back again
    load(file="mydata.RData")
    x
     [1]   1   4   9  16  25  36  49  64  81 100
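
The points above can be tied together in one small sketch. It is only an illustration, not part of the original session: the seed value 42, the variable names, and the temporary file are all my own assumptions. It also uses saveRDS()/readRDS(), an alternative to save()/load() that stores a single object and lets you pick the name it is restored under.

```r
# Fix the random seed so the "random" draws are repeatable across runs.
# The value 42 is an arbitrary choice made for this example.
set.seed(42)

# 100 dice rolls; replace = TRUE because each roll is independent.
dice <- sample(1:6, 100, replace = TRUE)

# Count how often each face came up. Going through factor() with
# explicit levels keeps a face in the table even if it was never rolled.
counts <- table(factor(dice, levels = 1:6))

# Vectorised arithmetic: divide every count by the total in one step
# to get the relative frequency of each face.
freq <- counts / sum(counts)

# Round-trip a single object through a file with saveRDS()/readRDS().
# tempfile() gives a throwaway path, so nothing is left in the working directory.
path <- tempfile(fileext = ".rds")
saveRDS(freq, path)
restored <- readRDS(path)

# Check that what came back matches what was saved.
print(isTRUE(all.equal(freq, restored)))
```

Unlike load(), readRDS() returns the object instead of silently recreating a variable with its old name, which makes it harder to overwrite something in your workspace by accident.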

And I haven’t even covered all the built-in functions and libraries yet. In fact, you would be amazed at the number of functions that are already available in the shell. For example, pressing TAB twice after typing “a” lists everything starting with “a”:

 a [TAB][TAB]
Display all 228 possibilities? (y or n)
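
The same list can be obtained programmatically. The following is a minimal sketch using apropos() from the standard utils package; the variable name a_funs is my own, and the count you get will likely differ from the 228 above, since it depends on the R version and on which packages are attached. Tab completion may also include non-function objects, which this call filters out.

```r
# Names of all functions on the search path that start with a lowercase "a".
# ignore.case = FALSE is needed because apropos() is case-insensitive by default.
a_funs <- apropos("^a", ignore.case = FALSE, mode = "function")

# How many there are, and a peek at the first few.
length(a_funs)
head(a_funs)
```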

It’s not tough to learn and I like it.
