R Type Provider


Reading and writing RData files

When using R, you can save and load data sets as *.rdata files. The R provider can both read and write these files, so if you want to perform part of your data acquisition, analysis and visualization in F# and another part in R, you can pass the data between the two as *.rdata files.

Passing data from R to F#

Let's say that you have some data in R and want to pass it to F#. To do that, you can use the save function in R. The following R snippet creates a simple *.rdata file containing a few symbols derived from the sample volcano data set:

require(datasets)
volcanoList <- unlist(as.list(volcano))
volcanoMean <- mean(volcanoList)
symbols <- c("volcano", "volcanoList", "volcanoMean")
save(list=symbols, file="C:/data/sample.rdata")

To import the data on the F# side, you can use the RData type provider that is available in the RProvider namespace. It takes a static parameter specifying the path of the file (absolute or relative) and generates a type that exposes all the saved values as static members:

open RProvider

type Sample = RData<"data/sample.rdata">
let sample = Sample()

// Easily access saved values
sample.volcano
sample.volcanoList
sample.volcanoMean

When a value is accessed, the type provider automatically converts the data from the R format to an F# format. In the above example, volcanoList is imported as float[] and volcanoMean is imported as a singleton array. The provider does not detect that the array has only one element, so you read the value using sample.volcanoMean.[0]. For the sample.volcano value, the R provider does not have a default conversion, so it is exposed as SymbolicExpression.
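Because scalar R values arrive as one-element arrays, it can be convenient to unwrap them in one place instead of indexing with .[0] throughout the code. The following is a minimal sketch of such a helper. The scalar function is a hypothetical name and is not part of RProvider, and the array used here is a stand-in for sample.volcanoMean so that the snippet runs without R:

```fsharp
// Hypothetical helper (not part of RProvider): unwrap a one-element array
// and fail loudly if the array has any other length.
let scalar (values: 'T[]) : 'T =
    match values with
    | [| value |] -> value
    | _ ->
        invalidArg "values"
            (sprintf "Expected exactly one element, got %d" values.Length)

// Stand-in for sample.volcanoMean, which the provider exposes as float[]
let volcanoMean = [| 42.0 |]

let mean = scalar volcanoMean
printfn "%f" mean
```

Unlike .[0], which silently returns the first element of a longer array, this version raises an ArgumentException when the saved value turns out not to be a scalar.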

When you have a number of *.rdata files containing data in the same format, you can pick one of them as a sample (which will be used to determine the fields of the type) and then pass the file name to the constructor of the generated type to load it. For example, if we had files data/sample_1.rdata to data/sample_10.rdata, we could read them as:

let means = 
  [ for i in 1 .. 10 ->
      let data = Sample(sprintf "data/sample_%d.rdata" i)
      data.volcanoMean.[0] ]

Note that the default conversions available depend on the plugins that are currently installed. For example, when you install the entire FsLab package with the Deedle library, the RData provider will automatically expose data frames as Deedle Frame<string, string> values.

Passing data from F# to R

If you perform data acquisition in F# and then want to pass the data to R, you can use the standard R functions for saving the *.rdata files. The easiest option is to call the R.assign function to define named values in the R environment and then use R.save to save the environment to a file:

// Calculate the squared differences from the mean
let avg = sample.volcanoList |> Array.average
let sqrs = 
  sample.volcanoList 
  |> Array.map (fun v -> pown (v - avg) 2)

// Save the squares to an RData file
R.assign("volcanoDiffs", sqrs)
R.save(list=[ "volcanoDiffs" ], file="C:/temp/volcano.rdata")

It is recommended to use the list parameter of the save function to specify the names of the symbols that should be saved, rather than saving all symbols. The R provider uses additional temporary symbols, so the saved file would otherwise contain unnecessary fields.

Once you save the file using the above command, you can reload it using the RData type provider, for example: new RData<"C:/temp/volcano.rdata">().
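As a rough sketch of the full round trip, the saved file can be given its own provided type and read back like the sample file earlier. The names Volcano and saved are chosen here for illustration. The snippet assumes that C:/temp/volcano.rdata already exists when the code is compiled (the provider reads it to generate the members) and that the numeric vector comes back as float[], as volcanoList does:

```fsharp
open RProvider

// Assumes C:/temp/volcano.rdata was already written by the R.save call;
// the type provider needs the file in order to generate the members.
type Volcano = RData<"C:/temp/volcano.rdata">
let saved = Volcano()

// Assumption: the numeric vector is converted to float[], like volcanoList
let diffs : float[] = saved.volcanoDiffs
printfn "Loaded %d squared differences" diffs.Length
```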
