### Deedle

When loading data from an external data source (such as a database), you might want to create a virtual time series that represents the data source, but does not actually load the data until needed. If you apply some range restriction (like slicing) to the data series before using the values, then it is not necessary to load the entire data set into memory.

Deedle supports lazy loading through the DelayedSeries.FromValueLoader method. It returns an ordinary data series of type Series<K, V> which has a delayed internal representation.

## Creating lazy series

We will not use a real database in this tutorial, but let's say that you have the following function which loads data for a given day range:

```fsharp
open System
open Deedle

/// Given a time range, generates random values for dates (at 12:00 AM)
/// starting with the day of the first date time and ending with the
/// day after the second date time (to make sure they are in range)
let generate (low:DateTime) (high:DateTime) =
  let rnd = Random()
  let days = int (high.Date - low.Date).TotalDays + 1
  seq { for d in 0 .. days ->
          KeyValue.Create(low.Date.AddDays(float d), rnd.Next()) }
```

Using random numbers as the source in this example is not entirely correct, because we get different values each time a new sub-range of the series is requested. It will suffice for the demonstration, though.
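
If repeatable values matter for your experiments, one option is to derive each value from its date instead of from a shared random generator. The following is a minimal sketch under that assumption; the name generateStable and the seeding scheme are hypothetical and not part of Deedle.

```fsharp
open System
open System.Collections.Generic

/// Hypothetical variant of the generator: every value depends only on its
/// date, so overlapping queries return the same value for the same day
let generateStable (low:DateTime) (high:DateTime) =
  let days = int (high.Date - low.Date).TotalDays + 1
  seq { for d in 0 .. days do
          let date = low.Date.AddDays(float d)
          // Seed a fresh generator from the date (illustrative scheme only)
          let seed = date.Year * 1000 + date.DayOfYear
          let value = Random(seed).Next()
          yield KeyValuePair(date, value) }
```

It produces the same kind of sequence as generate, so it could be handed to a loader in its place.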

Now, to create a lazily loaded series, we need to open the Deedle.Indices namespace, specify the minimal and maximal keys of the series, and use DelayedSeries.FromValueLoader:

```fsharp
open Deedle.Indices

// Minimal and maximal values that can be loaded from the series
let min, max = DateTime(2010, 1, 1), DateTime(2013, 1, 1)

// Create a lazy series for the given range
let ls = DelayedSeries.FromValueLoader(min, max, fun (lo, lob) (hi, hib) -> async {
    printfn "Query: %A - %A" (lo, lob) (hi, hib)
    return generate lo hi })
```

To make diagnostics easier, the loader prints the requested range whenever a request is made. After running this code, you should not see any output yet. The loader passed to DelayedSeries.FromValueLoader is a function that takes two tuple arguments, (lo, lob) and (hi, hib), that is, four values in total:

• lo and hi specify the low and high boundaries of the range. Their type is the type of the key (e.g. DateTime in our example)
• lob and hib are values of type BoundaryBehavior and can be either Inclusive or Exclusive. They specify whether the boundary value should be included or not.

Our sample function does not handle boundaries correctly - it always includes the boundary (and possibly more values). This is not a problem, because the lazy loader automatically skips over values outside the requested range. But if you want, you can use the lob and hib parameters to build a more precise SQL query.
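
To illustrate how lob and hib might be honoured, here is a minimal sketch of a loader that returns only the keys that were asked for. The names inRange, loadDays and exact are hypothetical, and the stand-in source deliberately over-fetches by one day; a real loader would more likely translate the boundaries into comparison operators in its query.

```fsharp
open System
open Deedle
open Deedle.Indices

/// Hypothetical helper: does `key` fall inside the range described by the boundaries?
let inRange (lo:DateTime, lob:BoundaryBehavior) (hi:DateTime, hib:BoundaryBehavior) (key:DateTime) =
  let aboveLow = if lob = BoundaryBehavior.Inclusive then key >= lo else key > lo
  let belowHigh = if hib = BoundaryBehavior.Inclusive then key <= hi else key < hi
  aboveLow && belowHigh

/// Stand-in data source (assumption): the day of the year for each date,
/// including one extra day past `high`
let loadDays (low:DateTime) (high:DateTime) =
  let days = int (high.Date - low.Date).TotalDays + 1
  seq { for d in 0 .. days do
          let date = low.Date.AddDays(float d)
          yield KeyValue.Create(date, date.DayOfYear) }

// Lazy series whose loader trims the result to the requested boundaries
let exact = DelayedSeries.FromValueLoader(DateTime(2010, 1, 1), DateTime(2013, 1, 1), fun (lo, lob) (hi, hib) -> async {
    return loadDays lo hi |> Seq.filter (fun kvp -> inRange (lo, lob) (hi, hib) kvp.Key) })
```

Filtering in memory is only for demonstration: the benefit of using the boundaries comes from not fetching the extra rows in the first place.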

## Using un-evaluated series

Let's now have a look at the operations that we can perform on un-evaluated series. Any operation that actually accesses values or keys of the series (such as Series.observations or lookup for a specific key) will force the evaluation of the series.

However, we can use range restrictions before accessing the data:

```fsharp
// Get series representing January 2012
let jan12 = ls.[DateTime(2012, 1, 1) .. DateTime(2012, 2, 1)]

// Further restriction - only first half of the month
let janHalf = jan12.[.. DateTime(2012, 1, 15)]

// Get value for a specific date
janHalf.[DateTime(2012, 1, 1)]
// Query: (1/1/2012, Inclusive) - (1/15/2012, Inclusive)
// val it : int = 1127670994

janHalf.[DateTime(2012, 1, 2)]
// val it : int = 560920727
```

As the Query line in the output shows, the series obtained data for the 15-day range that we created by restricting the original series. When we requested another value within that range, it was already available and was returned immediately. Note that janHalf is restricted to the specified 15-day range, so we cannot access values outside of it. Also, when you access a single value, the entire (restricted) series is loaded. The motivation is that you probably need to access multiple values, so it is likely cheaper to load the whole series at once.
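
The effect of the restriction can be probed with TryGet, which returns an optional value rather than failing for an absent key. The following is a small self-contained sketch; lazyDays and its day-of-year values are a hypothetical stand-in for the series used in this tutorial.

```fsharp
open System
open Deedle
open Deedle.Indices

// Stand-in lazy series (assumption): the day of the year for each date
let lazyDays = DelayedSeries.FromValueLoader(DateTime(2010, 1, 1), DateTime(2013, 1, 1), fun (lo:DateTime, _) (hi:DateTime, _) -> async {
    let days = int (hi.Date - lo.Date).TotalDays
    return seq { for d in 0 .. days do
                   let date = lo.Date.AddDays(float d)
                   yield KeyValue.Create(date, date.DayOfYear) } })

// Restrict to the first half of January 2012 (nothing is loaded yet)
let firstHalf = lazyDays.[DateTime(2012, 1, 1) .. DateTime(2012, 1, 15)]

// The first lookup forces the load; both lookups then use the loaded data
let inside = firstHalf.TryGet(DateTime(2012, 1, 10))
let outside = firstHalf.TryGet(DateTime(2012, 1, 20))
printfn "inside: %b, outside: %b" inside.HasValue outside.HasValue
```

Indexing with firstHalf.[key] for a key outside the restriction is expected to raise an exception instead, so TryGet is the gentler choice when a key may fall outside the range.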

Another operation that can be performed on an unevaluated series is to add it to a data frame with some existing key range:

```fsharp
// Create empty data frame for days of December 2011
let dec11 = Frame.ofRowKeys [ for d in 1 .. 31 -> DateTime(2011, 12, d) ]

// Add series as the 'Values' column to the data frame
dec11?Values <- ls
// Query: (12/1/2011, Inclusive) - (12/31/2011, Inclusive)
```

When a lazy series is added to a data frame, it has to be evaluated (so that the values can be properly aligned), but it is first restricted to the key range of the data frame. In the above example, only one month of data is loaded.
