F# Frame extensions

Namespace: Deedle

This module contains F# functions and extensions for working with frames. This includes operations for creating frames such as the frame function, => operator and Frame.ofRows, Frame.ofColumns and Frame.ofRowKeys functions. The module also provides additional F# extension methods including ReadCsv, SaveCsv and PivotTable.

Frame construction
Frame operations
Input and output

Frame construction

The functions and methods in this group can be used to create frames. If you are creating a frame from a number of sample values, you can use frame and the => operator (or the =?> operator, which is useful if you have multiple series of distinct types):

frame [ "Column 1" => series [ 1 => 1.0; 2 => 2.0 ]
        "Column 2" => series [ 3 => 3.0 ] ]

Aside from this, the various type extensions let you write Frame.ofXyz to construct frames from data in various formats. Frame.ofRows and Frame.ofColumns create a frame from a series or a sequence of rows or columns; Frame.ofRecords creates a frame from .NET objects using reflection; and Frame.ofRowKeys creates an empty frame with the specified row keys.
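As a rough illustration of the Frame.ofXyz functions described above, the following script sketch builds small frames from rows, from records and from a set of row keys. The `Person` record, the sample values and the `#r "nuget: Deedle"` reference (which assumes the script runs in a recent `dotnet fsi`) are assumptions made for this example only.

```fsharp
#r "nuget: Deedle"   // assumption: run as a script in a recent dotnet fsi
open Deedle

// Hypothetical record type used only for this example
type Person = { Name : string; Age : int }

// Frame.ofRows: each pair is a row key and a series mapping column keys to values
let rows =
  [ "r1" => series [ "A" => 1.0; "B" => 2.0 ]
    "r2" => series [ "A" => 3.0 ] ]            // "B" is missing for row "r2"
let byRows = Frame.ofRows rows

// Frame.ofRecords: properties of the record become columns, rows get ordinal keys
let records =
  [ { Name = "Joe"; Age = 51 }
    { Name = "Tomas"; Age = 28 } ]
let people = Frame.ofRecords records

// Frame.ofRowKeys: a frame with known row keys and no columns yet
let empty : Frame<int, string> = Frame.ofRowKeys [ 1; 2; 3 ]

printfn "byRows: %d rows, %d columns" byRows.RowCount byRows.ColumnCount
printfn "people: %d rows, %d columns" people.RowCount people.ColumnCount
printfn "empty:  %d rows, %d columns" empty.RowCount empty.ColumnCount
```

Binding each input list to a name before passing it to the `Frame.ofXyz` function keeps overload resolution simple, because several of these functions are overloaded on the argument type.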

Functions and values

Function or value Description


            ( =?> ) a b

Signature: a:'?497402 -> b:ISeries<'?497403> -> '?497402 * ISeries<'?497403>
Type parameters: '?497402, '?497403

Custom operator that can be used when constructing a frame from observations of series. The operator simply returns a tuple, but it upcasts the series argument so you don't have to do manual casting. For example:

`frame [ "k1" =?> series [0 => "a"]; "k2" =?> series [0 => 1.5] ]`


            ( => ) a b

Signature: a:'?497399 -> b:'?497400 -> '?497399 * '?497400
Type parameters: '?497399, '?497400

Custom operator that can be used when constructing series from observations or frames from key-row or key-column pairs. The operator simply returns a tuple, but it provides a more convenient syntax. For example:

`series [ "k1" => 1; "k2" => 15 ]`


            frame columns

Signature: columns:seq<'?497405 * '?497406> -> Frame<'?497407,'?497405>
Type parameters: '?497405, '?497406, '?497407

A function for constructing a data frame from a sequence of name-column pairs. It provides a more convenient syntax for Frame.ofColumns.

Example

To create a simple frame with two columns, you can write:

frame [ "A" => series [ 1 => 30.0; 2 => 35.0 ]
        "B" => series [ 1 => 30.0; 3 => 40.0 ] ]

Type extensions

Type extension	Description
`ofArray2D(array)` Signature: (array:'T [,]) -> Frame<int,int> Type parameters: 'T	Create data frame from a 2D array of values. The first dimension of the array is used as rows and the second dimension is treated as columns. Rows and columns of the returned frame are indexed with the element's offset in the array. Parameters `array` - A two-dimensional array to be converted into a data frame
`ofColumns(cols)` Signature: cols:Series<'C,'?497429> -> Frame<'R,'C> Type parameters: 'C, '?497429, 'R	Creates a frame from a series that maps column keys to a nested series containing values for each column.
`ofColumns(cols)` Signature: (cols:seq<'C * '?497433>) -> Frame<'R,'C> Type parameters: 'C, '?497433, 'R	Creates a frame from a sequence of column keys and column series pairs. The column series can contain values of any type, but it has to be the same for all the series - if you have heterogeneously typed series, use `=?>`.
`ofRecords(series)` Signature: series:Series<'K,'R> -> Frame<'K,string> Type parameters: 'K, 'R	Creates a data frame from a series containing any .NET objects. The method uses reflection over the series value type `'R` and turns its properties to columns.
`ofRecords(values)` Signature: values:seq<'T> -> Frame<int,string> Type parameters: 'T	Creates a data frame from a sequence of any .NET objects. The method uses reflection over the specified type parameter `'T` and turns its properties to columns.
`ofRecords(values, indexCol)` Signature: (values:IEnumerable * indexCol:string) -> Frame<'R,string> Type parameters: 'R	Creates a data frame from a sequence of any .NET objects. The method uses reflection over the type of the objects and turns its properties to columns. The `indexCol` parameter names the column to be used as the row index of the resulting frame.
`ofRowKeys(keys)` Signature: keys:seq<'R> -> Frame<'R,string> Type parameters: 'R	Creates a frame with the specified row keys, but no columns (and no data). This is useful if you want to build a frame gradually and restrict all the later added data to a sequence of row keys known in advance.
`ofRows(rows)` Signature: (rows:seq<'R * '?497419>) -> Frame<'R,'C> Type parameters: 'R, '?497419, 'C	Creates a frame from a sequence of row keys and row series pairs. The row series can contain values of any type, but it has to be the same for all the series - if you have heterogeneously typed series, use `=?>`.
`ofRows(rows)` Signature: rows:Series<'R,'?497423> -> Frame<'R,'C> Type parameters: 'R, '?497423, 'C	Creates a frame from a series that maps row keys to a nested series containing values for each row.
`ofRowsOrdinal(rows)` Signature: rows:seq<'?497414> -> Frame<int,'K> Type parameters: '?497414, 'K, 'V	Creates a frame with an ordinal integer index from a sequence of rows. The column indices of individual rows are unioned, so a row with fewer columns is still added successfully, but the frame will contain missing values for it.
`ofValues(values)` Signature: (values:seq<'R * 'C * 'V>) -> Frame<'R,'C> Type parameters: 'R, 'C, 'V	Create a data frame from a sequence of tuples containing row key, column key and a value.
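To make the less common constructors in the table above more concrete, here is a minimal sketch of `ofValues`, `ofArray2D` and `ofRowsOrdinal`. The sample data is made up, and the `#r "nuget: Deedle"` line assumes the code runs as a script in a recent `dotnet fsi`.

```fsharp
#r "nuget: Deedle"   // assumption: run as a script in a recent dotnet fsi
open Deedle

// Frame.ofValues: (row key, column key, value) triples
let triples = [ ("r1", "A", 1.0); ("r1", "B", 2.0); ("r2", "A", 3.0) ]
let fromValues = Frame.ofValues triples

// Frame.ofArray2D: the first dimension becomes rows, the second becomes columns;
// both are indexed by the element's offset in the array
let grid = array2D [ [ 1.0; 2.0; 3.0 ]; [ 4.0; 5.0; 6.0 ] ]
let fromArray = Frame.ofArray2D grid

// Frame.ofRowsOrdinal: rows get ordinal integer keys and column keys are unioned,
// so the first row has a missing value in column "B"
let rowSeries = [ series [ "A" => 1.0 ]; series [ "A" => 2.0; "B" => 3.0 ] ]
let fromRows = Frame.ofRowsOrdinal rowSeries

printfn "fromValues: %d x %d" fromValues.RowCount fromValues.ColumnCount
printfn "fromArray:  %d x %d" fromArray.RowCount fromArray.ColumnCount
printfn "fromRows:   %d x %d" fromRows.RowCount fromRows.ColumnCount
```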

Frame operations

The group contains two overloads of the F#-friendly version of the PivotTable method.

Type extensions

Type extension	Description
`PivotTable(r, c, op)` Signature: (r:'TColumnKey * c:'TColumnKey * op:(Frame<'TRowKey,'TColumnKey> -> 'T)) -> Frame<'R,'C> Type parameters: 'R, 'C, 'T	Creates a new data frame resulting from a 'pivot' operation. Consider a denormalized data frame representing a table: column labels are field names and table values are observations of those fields. pivotTable buckets the rows along two axes, according to the values of the columns `r` and `c`, and then computes a value for the frame of rows that land in each bucket. Parameters `r` - A column key to group on for the resulting row index `c` - A column key to group on for the resulting column index `op` - A function computing a value from the corresponding bucket frame


            PivotTable(r, c, op)

Signature: (r:'TColumnKey * c:'TColumnKey * op:(Frame<'TRowKey,'TColumnKey> -> 'T)) -> Frame<'R,'C>
Type parameters: 'R, 'C, 'T

Creates a new data frame resulting from a 'pivot' operation. Consider a denormalized data frame representing a table: column labels are field names and table values are observations of those fields. pivotTable buckets the rows along two axes, according to the values of the columns r and c, and then computes a value for the frame of rows that land in each bucket.

Parameters

r - A column key to group on for the resulting row index
c - A column key to group on for the resulting col index
op - A function computing a value from the corresponding bucket frame
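The bucketing described above can be sketched with a small sales table. This example uses the function form, `Frame.pivotTable`, with explicit key selectors, on the assumption that it behaves like the `PivotTable` method called with the column keys `"Region"` and `"Product"`. The `Sale` record and its data are hypothetical, and the `#r "nuget: Deedle"` line assumes a recent `dotnet fsi`.

```fsharp
#r "nuget: Deedle"   // assumption: run as a script in a recent dotnet fsi
open Deedle

// Hypothetical record type and data used only for this example
type Sale = { Region : string; Product : string; Amount : float }

let sales =
  [ { Region = "EU"; Product = "A"; Amount = 10.0 }
    { Region = "EU"; Product = "B"; Amount = 5.0 }
    { Region = "US"; Product = "A"; Amount = 7.0 }
    { Region = "EU"; Product = "A"; Amount = 2.0 } ]
let df = Frame.ofRecords sales

// Bucket rows by Region (new row keys) and Product (new column keys),
// then count the rows of the sub-frame that lands in each bucket
let counts =
  df
  |> Frame.pivotTable
       (fun _ (row : ObjectSeries<string>) -> row.GetAs<string>("Region"))
       (fun _ (row : ObjectSeries<string>) -> row.GetAs<string>("Product"))
       Frame.countRows

printfn "%d row keys, %d column keys" counts.RowCount counts.ColumnCount
```

Combinations that never occur in the input (here the "US" and "B" pair) have no bucket, so the corresponding cell in the result is a missing value.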

Input and output

This group of extensions includes a number of overloads for the ReadCsv and SaveCsv methods. The methods are designed to be used from F#, so they are F#-style extensions that use F#-style optional arguments. In general, the overloads take either a path, a Stream or a TextReader/TextWriter. Also note that ReadCsv<'R>(path, indexCol, ...) lets you specify the column to be used as the index.
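As a minimal sketch of the F#-style optional arguments, the following script reads a frame from a `TextReader` and writes it back to a `TextWriter`, so no file on disk is needed. The CSV text is made up for the example, and the `#r "nuget: Deedle"` line assumes a recent `dotnet fsi`.

```fsharp
#r "nuget: Deedle"   // assumption: run as a script in a recent dotnet fsi
open System.IO
open Deedle

// Sample CSV text, made up for this example
let csv = "Day,Temp\n1,20.5\n2,21.0\n3,19.5\n"

// Read from a TextReader; optional arguments are passed by name
let reader : TextReader = upcast new StringReader(csv)
let df = Frame.ReadCsv(reader, hasHeaders=true, inferTypes=true)

// Write to a TextWriter, using ';' as the separator and omitting row keys
let writer = new StringWriter()
df.SaveCsv(writer, includeRowKeys=false, separator=';')

printfn "%d rows, %d columns" df.RowCount df.ColumnCount
printfn "%s" (writer.ToString())
```

Any optional argument that is left out falls back to the default described in the parameter lists of the individual overloads.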

Type extensions

Type extension	Description
`ReadCsv(...)` Signature: (path:string * indexCol:string * hasHeaders:bool option * inferTypes:bool option * inferRows:int option * schema:string option * separators:string option * culture:string option * maxRows:int option * missingValues:string [] option) -> Frame<'R,string> Type parameters: 'R	Load data frame from a CSV file. The operation automatically reads column names from the CSV file (if they are present) and infers the type of values for each column. Columns of primitive types (`int`, `float`, etc.) are converted to the right type. Columns of other types (such as dates) are not converted automatically. Parameters `path` - Specifies a file name or a web location of the resource. `indexCol` - Specifies the column that should be used as an index in the resulting frame. The type is specified via a type parameter, e.g. use `Frame.ReadCsv<int>("file.csv", indexCol="Day")`. `hasHeaders` - Specifies whether the input CSV file has a header row `inferTypes` - Specifies whether the method should attempt to infer types of columns automatically (set this to `false` if you want to specify schema) `inferRows` - If `inferTypes=true`, this parameter specifies the number of rows to use for type inference. The default value is 0, meaning all rows. `schema` - A string that specifies CSV schema. See the documentation for information about the schema format. `separators` - A string that specifies one or more (single character) separators that are used to separate columns in the CSV file. Use for example `";"` to parse semicolon separated files. `culture` - Specifies the name of the culture that is used when parsing values in the CSV file (such as `"en-US"`). The default is invariant culture. `maxRows` - The maximal number of rows that should be read from the CSV file. `missingValues` - An array of strings that contains values which should be treated as missing when reading the file. The default value is: "NaN"; "NA"; "#N/A"; ":"; "-"; "TBA"; "TBD".
`ReadCsv(...)` Signature: (path:string * hasHeaders:bool option * inferTypes:bool option * inferRows:int option * schema:string option * separators:string option * culture:string option * maxRows:int option * missingValues:string [] option) -> Frame<int,string>	Load data frame from a CSV file. The operation automatically reads column names from the CSV file (if they are present) and infers the type of values for each column. Columns of primitive types (`int`, `float`, etc.) are converted to the right type. Columns of other types (such as dates) are not converted automatically. Parameters `path` - Specifies a file name or a web location of the resource. `hasHeaders` - Specifies whether the input CSV file has a header row `inferTypes` - Specifies whether the method should attempt to infer types of columns automatically (set this to `false` if you want to specify schema) `inferRows` - If `inferTypes=true`, this parameter specifies the number of rows to use for type inference. The default value is 100. `schema` - A string that specifies CSV schema. See the documentation for information about the schema format. `separators` - A string that specifies one or more (single character) separators that are used to separate columns in the CSV file. Use for example `";"` to parse semicolon separated files. `culture` - Specifies the name of the culture that is used when parsing values in the CSV file (such as `"en-US"`). The default is invariant culture. `maxRows` - The maximal number of rows that should be read from the CSV file. `missingValues` - An array of strings that contains values which should be treated as missing when reading the file. The default value is: "NaN"; "NA"; "#N/A"; ":"; "-"; "TBA"; "TBD".
`ReadCsv(...)` Signature: (stream:Stream * hasHeaders:bool option * inferTypes:bool option * inferRows:int option * schema:string option * separators:string option * culture:string option * maxRows:int option * missingValues:string [] option) -> Frame<int,string>	Load data frame from a CSV file. The operation automatically reads column names from the CSV file (if they are present) and infers the type of values for each column. Columns of primitive types (`int`, `float`, etc.) are converted to the right type. Columns of other types (such as dates) are not converted automatically. Parameters `stream` - Specifies the input stream, opened at the beginning of CSV data `hasHeaders` - Specifies whether the input CSV file has header row `inferTypes` - Specifies whether the method should attempt to infer types of columns automatically (set this to `false` if you want to specify schema) `inferRows` - If `inferTypes=true`, this parameter specifies the number of rows to use for type inference. The default value is 100. `schema` - A string that specifies CSV schema. See the documentation for information about the schema format. `separators` - A string that specifies one or more (single character) separators that are used to separate columns in the CSV file. Use for example `";"` to parse semicolon separated files. `culture` - Specifies the name of the culture that is used when parsing values in the CSV file (such as `"en-US"`). The default is invariant culture. `maxRows` - The maximal number of rows that should be read from the CSV file. `missingValues` - An array of strings that contains values which should be treated as missing when reading the file. The default value is: "NaN"; "NA"; "#N/A"; ":"; "-"; "TBA"; "TBD".
`ReadCsv(...)` Signature: (reader:TextReader * hasHeaders:bool option * inferTypes:bool option * inferRows:int option * schema:string option * separators:string option * culture:string option * maxRows:int option * missingValues:string [] option) -> Frame<int,string>	Load data frame from a CSV file. The operation automatically reads column names from the CSV file (if they are present) and infers the type of values for each column. Columns of primitive types (`int`, `float`, etc.) are converted to the right type. Columns of other types (such as dates) are not converted automatically. Parameters `reader` - Specifies the `TextReader`, positioned at the beginning of CSV data `hasHeaders` - Specifies whether the input CSV file has header row `inferTypes` - Specifies whether the method should attempt to infer types of columns automatically (set this to `false` if you want to specify schema) `inferRows` - If `inferTypes=true`, this parameter specifies the number of rows to use for type inference. The default value is 100. `schema` - A string that specifies CSV schema. See the documentation for information about the schema format. `separators` - A string that specifies one or more (single character) separators that are used to separate columns in the CSV file. Use for example `";"` to parse semicolon separated files. `culture` - Specifies the name of the culture that is used when parsing values in the CSV file (such as `"en-US"`). The default is invariant culture. `maxRows` - The maximal number of rows that should be read from the CSV file. `missingValues` - An array of strings that contains values which should be treated as missing when reading the file. The default value is: "NaN"; "NA"; "#N/A"; ":"; "-"; "TBA"; "TBD".
`SaveCsv(...)` Signature: (writer:TextWriter * includeRowKeys:bool option * keyNames:seq<string> option * separator:char option * culture:CultureInfo option) -> unit	Save data frame to a CSV file or a `TextWriter`. When calling the operation, you can specify whether you want to save the row keys or not (and headers for the keys) and you can also specify the separator (use `\t` for writing TSV files). When specifying file name ending with `.tsv`, the `\t` separator is used automatically. Parameters `writer` - Specifies the TextWriter to which the CSV data should be written `includeRowKeys` - When set to `true`, the row key is also written to the output file `keyNames` - Can be used to specify the CSV headers for row key (or keys, for multi-level index) `separator` - Specify the column separator in the file (the default is `\t` for TSV files and `,` for CSV files) `culture` - Specify the `CultureInfo` object used for formatting numerical data
`SaveCsv(...)` Signature: (path:string * includeRowKeys:bool option * keyNames:seq<string> option * separator:char option * culture:CultureInfo option) -> unit	Save data frame to a CSV file or a `TextWriter`. When calling the operation, you can specify whether you want to save the row keys or not (and headers for the keys) and you can also specify the separator (use `\t` for writing TSV files). When specifying file name ending with `.tsv`, the `\t` separator is used automatically. Parameters `path` - Specifies the output file name where the CSV data should be written `includeRowKeys` - When set to `true`, the row key is also written to the output file `keyNames` - Can be used to specify the CSV headers for row key (or keys, for multi-level index) `separator` - Specify the column separator in the file (the default is `\t` for TSV files and `,` for CSV files) `culture` - Specify the `CultureInfo` object used for formatting numerical data
`SaveCsv(path, keyNames)` Signature: (path:string * keyNames:seq<string>) -> unit	Save data frame to a CSV file or to a `TextWriter`. When calling the operation, you can specify whether you want to save the row keys or not (and headers for the keys) and you can also specify the separator (use `\t` for writing TSV files). When specifying file name ending with `.tsv`, the `\t` separator is used automatically. Parameters `path` - Specifies the output file name where the CSV data should be written `keyNames` - Specifies the CSV headers for row key (or keys, for multi-level index) `separator` - Specify the column separator in the file (the default is `\t` for TSV files and `,` for CSV files) `culture` - Specify the `CultureInfo` object used for formatting numerical data
`ToDataTable(rowKeyNames)` Signature: rowKeyNames:seq<string> -> DataTable	Returns the data of the frame as a .NET `DataTable` object. The column keys are automatically converted to strings that are used as column names. The row index is turned into an additional column with the specified name (the function takes the name as a sequence to support hierarchical keys, but typically you can write just `frame.ToDataTable(["KeyName"])`). Parameters `rowKeyNames` - Specifies the names of the row key components (or just a single row key name if the row index is not hierarchical).
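As a brief, hedged sketch of `ToDataTable`, the following script converts a two-column frame into a `DataTable` whose row index is stored in an extra column. The frame contents and the key name `"Key"` are made up for the example, and the `#r "nuget: Deedle"` line assumes a recent `dotnet fsi`.

```fsharp
#r "nuget: Deedle"   // assumption: run as a script in a recent dotnet fsi
open Deedle

// A small frame with integer row keys and two float columns
let df =
  frame [ "A" => series [ 1 => 30.0; 2 => 35.0 ]
          "B" => series [ 1 => 30.0; 2 => 40.0 ] ]

// The row index becomes an additional column named "Key"
let table = df.ToDataTable(["Key"])

printfn "%d rows, %d columns" table.Rows.Count table.Columns.Count
```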
