Profiling R
===========

Installation:
#############
Dependency: PAPI
================
The PAPI library is used to measure hardware performance counters and must be installed before profiling R. More information about PAPI is available at http://icl.cs.utk.edu/papi/index.html.

1. Untar "papi-5.3.0.tar.gz".
2. Follow the instructions in the INSTALL.txt file to compile and install PAPI. Use the --prefix option of ./configure to specify the installation folder of your choice.
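
As a rough sketch, the two steps above might look as follows on a typical Linux system. The install location "$HOME/papi" is only an example, and the sketch assumes the PAPI 5.3.0 layout, in which the configure script sits in the "src" subfolder. The INSTALL.txt file remains the authoritative reference.

.. code-block:: shell

   # Example install prefix (an assumption; choose any folder you can write to).
   PAPI_PREFIX="$HOME/papi"

   # Unpack the sources and build from the "src" subfolder.
   tar -xzf papi-5.3.0.tar.gz
   cd papi-5.3.0/src
   ./configure --prefix="$PAPI_PREFIX"
   make
   make install

The folder passed to --prefix is the one later substituted for <PAPI INSTALL FOLDER> when building R.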

Changes to R source code:
=========================
1. Untar "R-3.0.2.tar.gz".
2. Apply the patch "profilingR.patch" to the "R-3.0.2" folder just untarred. To do so, cd to the untarred folder and run the command "patch -p1 < profilingR.patch", adjusting the path to the patch file if it is stored outside that folder.
3. Edit the file "Makefile.in" in "R-3.0.2/src/main". Replace the placeholder <PAPI INSTALL FOLDER> with the path to your PAPI installation folder in the following variables:
	(a) PAPI_CPPFLAGS
	(b) PAPI_LIBS
4. Follow the instructions in the INSTALL file in the "R-3.0.2" folder to compile and install R into a folder of your choice.
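
Step 3 can also be done from the command line instead of a text editor. The helper below is a minimal sketch and is not part of the distributed files: "set_papi_dir" is a hypothetical name, and it assumes the placeholder appears in "Makefile.in" literally as <PAPI INSTALL FOLDER>. The paths in the commented example are likewise assumptions.

.. code-block:: shell

   # Hypothetical helper: replace every "<PAPI INSTALL FOLDER>" placeholder
   # in FILE with PAPI_DIR.
   # Usage: set_papi_dir FILE PAPI_DIR
   set_papi_dir() {
       file=$1
       papi_dir=$2
       # "|" is the sed delimiter so slashes in the path need no escaping;
       # the path itself must not contain "|", "&" or a backslash.
       sed "s|<PAPI INSTALL FOLDER>|$papi_dir|g" "$file" > "$file.tmp" &&
           mv "$file.tmp" "$file"
   }

   # Example use for steps 1 to 4, run from the folder that holds the tarball
   # and the patch (all paths are assumptions; adjust them to your setup):
   #   tar -xzf R-3.0.2.tar.gz
   #   cd R-3.0.2
   #   patch -p1 < ../profilingR.patch
   #   set_papi_dir src/main/Makefile.in "$HOME/papi"
   #   ./configure --prefix="$HOME/R-profiling"
   #   make
   #   make install

After the substitution, it is worth checking that no <PAPI INSTALL FOLDER> text remains in "src/main/Makefile.in" before running configure.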

Details on the R source code changes:
-------------------------------------
1. The file "main.c" in the "R-3.0.2/src/main/" folder has been modified at the "eval" step of the Read-Eval-Print Loop (REPL) to profile two performance counters using PAPI. These counters can be changed to match the events available on your micro-architecture.
2. The file "Makefile.in" has been modified to add the PAPI include and library flags.


Running Experiments:
####################
To profile, start R with <R SOURCE LOCATION>/bin/R. At the interpreter prompt, every line of R source code you enter is profiled and the performance counter values are printed.

The list of performance counters used in the paper is provided in the file "performance events - Nehalem". To profile a different set of events, modify the file "main.c" in the "R-3.0.2/src/main" folder to use the desired events from this list.

A script to automate this is also provided in the "experiments" folder.

1. The folder "experiments" contains the files "main1.c" to "main12.c", each of which profiles a different pair of hardware performance events for the Nehalem micro-architecture.
2. Edit the "script" file so that it points to your R source folder, the R program you wish to profile, the output filename, and the "experiments" folder.
3. Running the "script" creates a file with the profile details of every line of R code in your program.
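
The provided "script" is the reference, and its contents are not reproduced here. Purely as an illustration of the kind of workflow it automates, a loop of the following shape would cover all 12 files. Everything in this sketch is an assumption: the function name "profile_all", its arguments, the BUILD_CMD variable, the rebuild via "make", and feeding the program to R on standard input so that it passes through the interpreter loop.

.. code-block:: shell

   # Hypothetical sketch, not the provided "script".
   # Usage: profile_all R_SRC PROGRAM OUTPUT EXPERIMENTS_DIR
   profile_all() {
       r_src=$1
       program=$2
       output=$3
       exp_dir=$4
       : > "$output"    # start with an empty output file
       i=1
       while [ "$i" -le 12 ]; do
           # Swap in the main.c variant that profiles the i-th set of events.
           cp "$exp_dir/main$i.c" "$r_src/src/main/main.c" || return 1
           # Rebuild R; BUILD_CMD can override the default "make".
           (cd "$r_src" && ${BUILD_CMD:-make} > /dev/null) || return 1
           # Mark which variant produced the output that follows.
           echo "### main$i.c" >> "$output"
           # Feed the program to the interpreter and append whatever it prints.
           "$r_src/bin/R" --vanilla < "$program" >> "$output" 2>&1
           i=$((i + 1))
       done
   }

A call might look like: profile_all "$HOME/R-3.0.2" myprog.R profile.txt "$HOME/experiments", where all four paths are placeholders. Note that the sketch overwrites "src/main/main.c" on every pass, so keep a copy of the patched original if you need it afterwards.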

Disclaimer:
###########
1. The experiments in the paper were run on an Intel Xeon E7-4850 processor based on the Nehalem micro-architecture. Hence, the performance counters used are specific to that micro-architecture. If you are running on a machine with a different micro-architecture, change the performance counters to ones supported by that micro-architecture.
2. The changes to the R source code apply to R version 3.0.2. If you use a different version, similar changes need to be made at the appropriate location in the source code. The patch file "profilingR.patch" shows the changes made to the R source code.

Queries:
########
For any questions about reproducing or running the experiments, please contact jignesh@cs.wisc.edu or shrirams@cs.wisc.edu.



