In some software areas, such as quantitative finance and other mathematical domains, data-centric programming typically requires balancing three requirements: (1) a solid platform with rich mathematical and statistical functionality, (2) an easy-to-use, contemporary programming environment that permits quick and flexible front-end development, and (3) an easy-to-use interface between the two.
In this article I explain how such a balance can be attained by combining two of the best products in their respective worlds: the rich R library as the mathematical/statistical component, and C# for the front-end application design. For the interface I chose R (D)COM, which is easy to use and spares you from spending hours tracking down interfacing problems.
The software required for this tutorial is the following:
1. R software (download from here)
2. R (D)COM Interface (download from here)
3. If you don't have a C# IDE, you can download Visual Studio Express for free (download from here)
Once you have installed all the software, we can focus on a small example. Although simple, it shows the flexibility that can be attained by interfacing R and C#. In the example, a C# Windows Form sends a request to R to generate a random data set from a standard normal distribution and calculate some basic statistics, and then displays the data and results on the form.
STEP 1 - Creating the Windows Form
In Visual Studio, create a new Windows Forms application.
STEP 2 - Add project reference to the R (D)COM library
From the Solution Explorer, add a reference to the R (D)COM library. In the COM components list, the component is typically listed as "Repository for R COM Server Instances".
STEP 3 - Add library references to your form code
This is done by adding the following using directives at the top of the form's code file:
using StatConnectorCommonLib;
using STATCONNECTORSRVLib;
STEP 4 - Setting and Initializing Connection
Two variables are defined in the form class. The variable rconn, of type StatConnector, represents the connection to the R COM component. The variable dataSize stores the size of the data array that we are going to generate randomly in R.
private StatConnector rconn;
private int dataSize;
Both variables are initialized in the default constructor of the form, which in my case is named frmMain.
public frmMain()
{
    InitializeComponent();
    // Number of random values to request from R
    dataSize = 100;
    // Create the COM connector and start an R session behind it
    rconn = new STATCONNECTORSRVLib.StatConnector();
    rconn.Init("R");
}
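The constructor above assumes that R starts successfully and never releases the connection. The following is a minimal sketch of guarding the startup and closing the session afterwards, written as a small console program so it stands on its own. It assumes the project references the R (D)COM library as in Step 2, that the connector exposes a Close() method (check the type library in the Object Browser), and that a failed startup surfaces as a COMException; the class name RConnectionSketch is hypothetical.

```csharp
using System;
using System.Runtime.InteropServices;
using STATCONNECTORSRVLib;

static class RConnectionSketch
{
    static void Main()
    {
        StatConnector rconn = new StatConnector();
        bool connected = false;
        try
        {
            // Same call as in the form constructor
            rconn.Init("R");
            connected = true;
            Console.WriteLine("Connected to R.");
        }
        catch (COMException ex)
        {
            // Assumption: startup problems are reported as COM errors
            Console.Error.WriteLine("Could not start R via (D)COM: " + ex.Message);
        }
        finally
        {
            // Assumption: Close() releases the R session held by the connector
            if (connected)
            {
                rconn.Close();
            }
        }
    }
}
```

In the Windows Form itself, the same Close() call would naturally go into the form's closing handler rather than a finally block.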
STEP 5 - Invoking R commands and displaying results
The form in my example has the following components:
1. A listbox which is used to present the random numbers generated from R
2. Two labels which present the Mean and Standard Deviation of the data calculated from R
3. A button which kicks off the process
The button's click handler is shown below:
private void btnGenData_Click(object sender, EventArgs e)
{
    // Pass the requested sample size to R
    rconn.SetSymbol("sdataSize", dataSize);
    // Generate data in R
    rconn.Evaluate("sdata<-rnorm(sdataSize)");
    double[] data = (double[])rconn.GetSymbol("sdata");
    // Calculate statistics in R
    rconn.Evaluate("saverage<-mean(sdata)");
    rconn.Evaluate("sstdev<-sd(sdata)");
    double average = (double)rconn.GetSymbol("saverage");
    double stdev = (double)rconn.GetSymbol("sstdev");
    // Display data and results on the Windows Form
    lbDataList.Items.Clear();
    // Loop over the array actually returned, not the requested size
    for (int c = 0; c < data.Length; c++)
    {
        lbDataList.Items.Add(data[c]);
    }
    lblAverage.Text = "Average: " + average;
    lblStdDev.Text = "Standard Deviation: " + stdev;
}
As can be seen from the code above, the interfacing mechanism is quite simple and intuitive. The one important thing to note is that the COM object returns general object references, which need to be cast as required. The data variable, which holds the random data vector, is cast to double[]. The single return values for the average and the standard deviation are each cast to double.
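A direct cast throws an InvalidCastException if R returns something other than the expected type, for example an integer vector instead of a double vector. The following is a minimal sketch of a more forgiving conversion. The class RValue and its methods are hypothetical helpers, not part of R (D)COM, and the sketch handles only scalars and one-dimensional arrays.

```csharp
using System;
using System.Globalization;

public static class RValue
{
    // Converts a value returned over COM into a double[].
    // Accepts a double[], any one-dimensional numeric array, or a single number.
    public static double[] ToDoubleArray(object value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        if (value is double[] doubles) return doubles;
        if (value is Array array)
        {
            if (array.Rank != 1)
                throw new InvalidCastException("Only one-dimensional arrays are supported.");
            var result = new double[array.Length];
            int i = 0;
            foreach (object item in array)
            {
                result[i++] = Convert.ToDouble(item, CultureInfo.InvariantCulture);
            }
            return result;
        }
        // A single value is treated as a vector of length one
        return new[] { Convert.ToDouble(value, CultureInfo.InvariantCulture) };
    }

    // Converts a value that is expected to hold exactly one number.
    public static double ToDouble(object value)
    {
        double[] values = ToDoubleArray(value);
        if (values.Length != 1)
            throw new InvalidCastException("Expected a single value but got " + values.Length + ".");
        return values[0];
    }
}
```

With this helper in the project, the casts in the click handler could be written as RValue.ToDoubleArray(rconn.GetSymbol("sdata")) and RValue.ToDouble(rconn.GetSymbol("saverage")).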
Pretty easy, right? I found this method really easy to use; it keeps you focused on the problem you wish to solve rather than on hours of debugging. Using C# for the front-end application also provides a lot of flexibility when it comes to building world-class applications and user interfaces. Testing the method with large amounts of data, running into the hundreds of thousands, also proved to be very efficient.
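When testing, one simple way to check the values coming back from R is to recompute them on the C# side. The sketch below does this for the two statistics used in the example. Note that R's sd() uses the sample formula with an n - 1 denominator, so the C# version does the same. The class SampleStats is a hypothetical helper introduced only for this check.

```csharp
using System;
using System.Linq;

public static class SampleStats
{
    // Arithmetic mean, matching R's mean() for a numeric vector
    public static double Mean(double[] data)
    {
        if (data == null || data.Length == 0)
            throw new ArgumentException("At least one value is required.", nameof(data));
        return data.Average();
    }

    // Sample standard deviation with an n - 1 denominator, matching R's sd()
    public static double StdDev(double[] data)
    {
        if (data == null || data.Length < 2)
            throw new ArgumentException("At least two values are required.", nameof(data));
        double mean = data.Average();
        double sumOfSquares = data.Sum(x => (x - mean) * (x - mean));
        return Math.Sqrt(sumOfSquares / (data.Length - 1));
    }
}
```

In the click handler, a check such as Math.Abs(SampleStats.Mean(data) - average) < 1e-9 should then hold, and similarly for the standard deviation, allowing for small floating-point differences.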
A continuation of this article is available in Part 2