roboquant avatar

Feed

What is a feed?

A feed is responsible for delivering the data that is needed to run and test strategies. There are no restrictions on the type of data that a Feed can emit. It can be as simple as stock prices or as complex as satellite images. A Feed is used to deliver historic and live data and the interface is the same.

A feed wraps its information in an Event and puts that Event on an EventChannel. This is all done asynchronously from the rest of the trading logic. So while your strategy is processing the previous event and calculating signals, the feed can receive new events.

And although an EventChannel is used to deliver the events, this channel is normally only accessed by internal roboquant logic. A strategy will just be passed the Event object as one it method parameters.

To read more about why roboquant has selected this event-driven approach, read this introduction to event-driven software.

Event

An Event instance contains all the data that is observed at a certain point in Time. It contains the following two attributes

  1. The time attribute when the information within the event became available.

  2. The list of all the Action instances at that time. This can be a mixed list of classes that implement the Action interface. A single Event can contain a mix of prices, satellite images and social media news at the same time.

Events are guaranteed to be delivered in chronological order. A sequential event has a time greater or equal than the previous event.

Time

The time attribute in an event is of the type java.time.Instant. So they can be easily processed and compared without having to worry about time zones. When a feed delivers events, they are always in chronological order.

It is very important that the time in the event reflects when the actions in that same event became available, not when they happened. Typically, there is some delay between when something occurred and when it is published. An event time should reflect the published time.

For example, when you want to use unemployment numbers as part of your feed, you should use the time when they are published, and you have access to them, not the last day of the period they cover. Otherwise, your strategy is looking into the future during back testing and can make decisions based on information it will never have in live trading.

Action

Action is an abstraction for any type of information that can be made available in an Event. An action ranges from price actions, like price-bars and order-book snapshots, to social media content or images.

There is no commonality between actions, and you can add any type of action you like to the platform. The only requirement is that an action implements the empty interface Action. It is up to the strategy to filter for the action types it is interested in.

For example, if your strategy would require PriceBar (candlestick) data, you could use the following pattern to filter this type:

for (priceBar in event.actions.filterIsInstance<PriceBar>()) {
    // your code
}

There are also a few helper methods in the Event class to make access to prices more convenient, since these are of interest to many strategies:

// iterate over all price actions in an event
for (priceAction in event.prices) {
    println(priceAction)
}

// get the first price for a particular asset
val price: Double? = event.getPrice(Asset("XYZ"))

Included Feeds

Roboquant includes several Feed implementations out of the box:

  • CSVFeed that can process CSV files stored somewhere on your local machine.

  • AvroFeed that can process an Avro files stored somewhere on your local machine.

  • RandomWalkFeed that, as the name suggests, generates random price actions for random assets.

  • AggregatorFeed that aggregates price to price-bars.

  • CombinedFeed that combines two or more feeds into a new feed.

  • Integration with third party data providers that make their data available through an API. See also the integration documentation.

The following table shows which out-of-the-box providers support which types of price-actions:

Provider

Type

Historic prices

Live prices

Generic CSV

CSV File

PriceBar, PriceQuote

-

Stooq CSV

File

PriceBar

-

Kraken CSV

File

PriceQuote

-

MT5 CSV

File

PriceBar, PriceQuote

-

Alpaca

API

PriceBar, PriceQuote, TradePrice

PriceBar, PriceQuote, TradePrice

Polygon.io

API

PriceBar

PriceBar, PriceQuote, TradePrice

Interactive Brokers

API

PriceBar

PriceBar

Alpha Vantage

API

PriceBar

-

Binance

API

PriceBar

PriceBar, PriceQuote

XChange

API

-

PriceQuote, TradePrice, OrderBook

Avro

File

PriceBar, PriceQuote, TradePrice, OrderBook

-

Custom Feeds

If you have a data source that is not yet covered by roboquant, you can implement your own Feed. The main method to implement is the play method that will put the available events on the EventChannel.

class MYFeed : Feed {

    val timeFrame: Timeframe
        get() = Timeframe.INFINITE

    override suspend fun play(channel: EventChannel) {
        TODO() // Your code goes here
    }
}

The following implementation of the play method sends 100 empty events (empty meaning an event without any actions included). When the EventChannel is full, the channel.send method will be blocking.

suspend fun play(channel: EventChannel) {
    for (i in 1..100) {
        val actions = emptyList<Action>() // replace with a real list of actions
        val now = Instant.now()
        val event = Event(actions, now)
        channel.send(event)
    }
}

CSV Feed

CSV files are one of the most commonly used file formats for storing historic price data. However, since there is no standard defined, they differ a lot depending on where you downloaded them from. Additionally, they often lack some required meta-data, like the currency of the price or the exchange where they are traded.

The CSVFeed class in roboquant offers several configuration options to cater to these differences, allowing you to parse almost any CSV files found on the internet.

Basic usage

The simplest usage is to provide the directory that contains the CSV files and the CSVFeed will try to parse them based on defaults and the content it finds.

val feed = CSVFeed("data/test")
roboquant.run(feed)
println(feed.assets.summary())

Configuration

Since the format of CSV files differs a lot, the CSVFeed allows for a lot of configuration to cater to this. If you just instantiate a CSVFeed with only a path, it will try to auto-detect the parameters based on the CSV headers and first row it encounters.

A more flexible way to configure the parsing it to provide a custom CSVConfig:

val df = DateTimeFormatter.ISO_DATE_TIME
val config = CSVConfig(
    filePattern = "*.csv",
    fileSkip = listOf("skip_me.csv"),
    hasHeader = false,
    separator = ';',
    priceParser = PriceBarParser(
        autodetect = false,
        open = 1,
        close = 2,
        low = 3,
        high = 4,
        timeSpan = 15.minutes
    ),
    timeParser = { line, asset -> LocalDateTime.parse(line[0], df).atZone(asset.exchange.zoneId).toInstant() },
    assetBuilder = { file -> Asset(file.nameWithoutExtension) },
)
val feed = CSVFeed("somepath", config)

Also, you can define some of the CSVConfig settings in a property file:

file.parse = *.csv
file.skip = ABC.csv XYZ.csv

In order to make it easier to use CSVFeed, there are several popular market data providers of CSV feed available as pre-configured CSVConfig settings:

// CSV files downloaded from stooq.pl
val feed1 = CSVFeed("somepath", CSVConfig.stooq())

// CSV files downloaded from histdata.com
val feed2 = CSVFeed("somepath", CSVConfig.histData())

// CSV files downloaded from MT5
val feed3 = CSVFeed("somepath", CSVConfig.mt5())

// CSV trade files downloaded from Kraken
val feed4 = CSVFeed("somepath", CSVConfig.kraken())

// CSV files downloaded from Yahoo Finance
val feed5 = CSVFeed("somepath", CSVConfig.yahoo())

Memory Constraints

The default CSVFeed loads all the data found in all the files into memory during initialization. The benefit is that after initialization, the CSVFeed is very fast when used in back tests. And since you can reuse the feed in parallel back-tests, this becomes even more important when doing extensive back testing.

However, a downside is the high-memory usage when working in a memory-constrained environment. So there is also a CSV Feed implementation that only reads the data when required. This is substantially slower, but consumes a lot less memory. It supports the same configuration; the only difference is that you need to use LazyCSVFeed instead of the regular CSVFeed

val feed = LazyCSVFeed("data/test")
roboquant.run(feed)
Convert the CSV files to an AvroFeed, that way you have the best of both worlds: low memory consumption and high performance. See Convert to an AvroFeed on how to do that.

AggregatorFeed

The aggregator feed enables you to aggregate PriceActions from another feed to higher level resolution PriceBars.

The AggregatorFeed supports all included types of PriceActions (PriceBar, TradePrice, PriceQuote, OrderBook). So for example, you can convert tick prices to price-bars.

The following example assumes the CSVFeed contains 1-minute price-bars, and you want to have 15-minute price-bars.

val oneMinuteFeed = CSVFeed("data/test")
val fifteenMinuteFeed = AggregatorFeed(oneMinuteFeed, 15.minutes)

You can also aggregate live feeds. For example, aggregate live quotes from the Binance exchange into one-minute price-bars.

val feed = BinanceLiveFeed()
feed.subscribePriceQuote("BTCBUSD", "ETHBUSD")

val priceBarFeed = AggregatorFeed(feed, 1.minutes)

CombinedFeed

You can only provide a single feed instance to the Roboquant.run method. However, it is easy to combine multiple feeds into a single one before invoking the run method:

val feed1 = CSVFeed("data/test1")
val feed2 = CSVFeed("data/test2")
val feed = CombinedFeed(feed1, feed2)

roboquant.run(feed)

You can combine any number of arbitrary feeds.

AvroFeed

Apache Avro™ is a data serialization system. Avro provides a compact, fast, binary data format. Avro file format is a row-based repository, making a good match for event-driven architectures like roboquant.

Avro files make it easy to store large amount of data in a single file and efficiently use it during back testing. The AvroFeed implementation of roboquant only loads prices in memory when it is required, so very large datasets can be used without having to worry about memory issues. Also since an AvroFeed uses a single file to store its prices, it is easier to distribute it than for example many directories full of CSV files.

As you might have guessed, the use of the AvroFeed comes highly recommended. It is fast and scales well with large sets of data.

Using an AvroFeed

Using any compatible Avro file as a feed is straightforward. You just provide the file location to the constructor as a parameter, and the feed will be ready to use.

val feed = AvroFeed("my_file.avro")
roboquant.run(feed)

A single feed can contain mixed price actions if required, so you can have Trade Prices and Price Bars in the same feed.

Convert to an AvroFeed

You can also convert other feeds into an AvroFeed that you can later use for back-testing. This works with both historic data feeds and live data feeds. The current limitation is that it only works with price actions and not other actions like image data or social media posts.

val fileName = "my_file.avro"

val feed = CSVFeed("some/path/")
AvroFeed.record(feed, fileName)

// Append to existing avro file
val feed2 = CSVFeed("some/path2/")
AvroFeed.record(feed2, fileName, append = true)

// Later you can use the same file in a AvroFeed
val feed3 = AvroFeed(fileName)
If you work a lot with directories full of CSV files, you can convert them to a single Avro file and use that from then on (using the above code snippet as an example). This is both faster and has lower memory consumption.

The nice thing is that the recording also works with live feeds. So you can record a live feed for a specified timeframe and then use it afterwards in your back tests.

The following example shows how you can record 4-hours worth of Bitcoin price-bars from the Binance exchange.

val feed = BinanceLiveFeed()
feed.subscribePriceBar("BTCUSDT")
var timeframe = Timeframe.next(4.hours)
AvroFeed.record(feed, "bitcoin.avro", timeframe = timeframe)

// You can also append to an existing Avro file
timeframe = Timeframe.next(8.hours)
AvroFeed.record(feed, "bitcoin.avro", timeframe = timeframe, append = true)

Default AvroFeeds

Roboquant comes with a few default AvroFeeds. The main benefit is that you can quickly test if your code is working without having to download or set up your own feeds. There are three feeds available:

  1. A feed with daily PriceBar market data for the S&P 500 stocks for the last few years

  2. A feed with a few minutes of price quotes, again for the S&P 500 stocks

  3. A feed with 1-minute forex data

val feed1 = AvroFeed.sp500() // S&P500 with daily PriceBar data
val feed2 = AvroFeed.sp500Quotes() // S&P500 with PriceQuote data
val feed3 = AvroFeed.forex() // EUR/USD forex data

In the future, there will be more default AvroFeeds that you can use, depending on the feedback from the community. Likely candidates that are currently missing are FOREX feeds and cryptocurrency feeds.

These default feeds should not be relied on for serious back testing. They just enable you to quickly run a back test and see if your strategy and policy are behaving as expected.

QuestDB Feed

QuestDBFeed and QuestDBRecorder are similar to AvroFeed. But in this case, the data will be stored in a QuestDB database.

The QuestDBFeed has the following characteristics:

  1. Support for very large data sets.

  2. Very fast read and write operations, with random access on timestamp.

  3. Support micro-second timestamp resolution.

  4. Supports only a single PriceAction type per feed. So a single feed can contain price-bars or trade-prices, but not both.

  5. Although no compression, efficient storage since good support for repeating strings.

  6. You can append new entries with overlapping time to an existing QuestDB feed.

  7. Support for PriceBar, PriceQuote and TradePrice.

Example how to create a QuestDBFeed based on some other feed (like a CSVFeed). This will use the default database.

val recorder = QuestDBRecorder()
recorder.record<PriceBar>(inputFeed, "pricebars")

val feed = QuestDBFeed("pricebars")

Example how to append data to an existing feed:

val recorder = QuestDBRecorder()

// If we want later to add out-of-order entries, we need to partition the table
recorder.record<PriceBar>(inputFeed1, "pricebars", partition = QuestDBRecorder.Partition.YEAR)

// Add the out-of-order pricebars
recorder.record<PriceBar>(inputFeed2, "pricebars", append = true)

val feed = QuestDBFeed("pricebars")

You can also access your database with a web-viewer. Accessing the default price database:

docker run --rm -p 9000:9000 -v "$HOME/.roboquant/questdb-prices:/var/lib/questdb" questdb/questdb:7.3.1