Feeding a real-time user interface

Vita Smid | EuroPython 2017

July 11, 2017

Hi, I am Vita.

I am a software engineer specializing in difficult, mathy problems.

Quantlane

  • We develop and run a stock trading platform and trading strategies.
  • Small team, lean principles.
  • All back-end code is Python 3.5 / 3.6.

Example 1

Trades

Time               Price  Quantity
1499530305.593857  23.45       500
1499530305.649646  23.46       323
1499530306.024135  23.46       107
1499530307.153155  23.45     1,300

Example 2

Open orders in the market

Side  Price          Quantity  Status
Buy   23.40             1,500  new → pending cancel…
Sell  23.95 → 24.00       900  new
Buy   23.30             2,000  creating…

Take 1

Naïve snapshots

  • Give each data producer in the platform a get_state method.
  • Call all get_state methods periodically.
  • Encode their return values in JSON and send them to clients.
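
The three steps above can be sketched roughly as follows. This is only an illustration: TradeFeed, build_snapshot and broadcast_forever are hypothetical names, and clients are assumed to be objects with a send() method.

```python
import json
import time


class TradeFeed:
    """Hypothetical data producer holding a list of trades."""

    def __init__(self):
        self._trades = []

    def add_trade(self, timestamp, price, quantity):
        self._trades.append({'time': timestamp, 'price': price, 'quantity': quantity})

    def get_state(self):
        # Return a copy so that callers cannot mutate the producer's state.
        return list(self._trades)


def build_snapshot(producers):
    """Call get_state on every producer and encode the results as one JSON message."""
    return json.dumps({name: producer.get_state() for name, producer in producers.items()})


def broadcast_forever(producers, clients, interval=0.5):
    """Periodically send the full snapshot to every client (assumed to expose send())."""
    while True:
        message = build_snapshot(producers)
        for client in clients:
            client.send(message)
        time.sleep(interval)
```

Every client receives the complete state on every tick, whether or not anything changed, which is what makes this approach naïve.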

Take 2

Diffing snapshots

  • Keep calling the get_state methods.
  • Instead of sending their return values to clients, send incremental updates – diffs.
  • Remember the last state seen by clients and compare each new state to it to generate diffs.

docs.python.org/3/library/difflib.html


import difflib
x = ('one', 'two', 'three')
y = ('two', 'two point five', 'four')
matcher = difflib.SequenceMatcher(None, x, y)
matcher.get_opcodes()

[('delete', 0, 1, 0, 0), ('equal', 1, 2, 0, 1),
 ('replace', 2, 3, 1, 3)]
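
One possible way to turn these opcodes into updates for a client, sketched with hypothetical helpers make_diffs and apply_diffs. The message shape (tag, start, end, items) is an assumption made for this example, not a format from the talk.

```python
import difflib


def make_diffs(old, new):
    """Translate SequenceMatcher opcodes into (tag, start, end, items) updates."""
    matcher = difflib.SequenceMatcher(None, old, new)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != 'equal':
            # Only changed ranges are sent; 'delete' carries an empty list of items.
            diffs.append((tag, i1, i2, list(new[j1:j2])))
    return diffs


def apply_diffs(old, diffs):
    """Client side: rebuild the new state from the old state and the diffs."""
    result = list(old)
    # Apply from the end so that earlier indices remain valid.
    for tag, i1, i2, items in reversed(diffs):
        result[i1:i2] = items
    return result
```

For the x and y above, make_diffs yields one 'delete' and one 'replace' update, and applying them to x on the client reproduces y.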

Why this is difficult #1

  • When a new client connects, you still have to send them a snapshot to get them started.
  • You might end up doing a lot of unnecessary work in your get_state methods…
  • …so your data producers should also set and clear a has_changed flag.
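
A minimal sketch of the has_changed idea, assuming a hypothetical OrderBook producer and poll helper. The last_seen dictionary doubles as the source of snapshots for newly connected clients.

```python
class OrderBook:
    """Hypothetical producer that tracks whether its state changed since the last poll."""

    def __init__(self):
        self._orders = []
        self.has_changed = False

    def add_order(self, order):
        self._orders.append(order)
        self.has_changed = True  # set on every write

    def get_state(self):
        self.has_changed = False  # cleared once the state has been read
        return tuple(self._orders)


def poll(producers, last_seen):
    """Return only the states of producers that changed; remember them in last_seen."""
    updates = {}
    for name, producer in producers.items():
        if producer.has_changed:
            state = producer.get_state()
            updates[name] = state
            last_seen[name] = state
    return updates
```

Producers that did not change are skipped entirely, so their get_state methods do no work on that tick.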

Why this is difficult #2

  • difflib only works with sequences of hashable items.
  • Therefore, you must have a canonical hashable representation of every piece of state you want to send to the client.
  • For small sequences it’s faster to just send a snapshot 🙄
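
To illustrate the hashability requirement: a sketch that assumes orders are plain dictionaries with hashable values. The canonical helper is hypothetical; it is just one way to get a stable, hashable representation.

```python
import difflib


def canonical(order):
    """Turn a dict into a hashable tuple with a stable key order."""
    return tuple(sorted(order.items()))


old = [{'side': 'Buy', 'price': 23.40, 'quantity': 1500}]
new = [
    {'side': 'Buy', 'price': 23.40, 'quantity': 1500},
    {'side': 'Sell', 'price': 24.00, 'quantity': 900},
]

# Passing the dicts directly would raise TypeError: unhashable type: 'dict'.
matcher = difflib.SequenceMatcher(
    None,
    [canonical(order) for order in old],
    [canonical(order) for order in new],
)
opcodes = matcher.get_opcodes()
```

Here opcodes describes one unchanged item followed by one insertion.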

Take 3

Generating diffs on every state write

  • Every time you call…
    orders.insert(123, new_order)
  • …something somewhere remembers that
    ('insert', 123, new_order) happened.
  • Same goes for orders[123] = updated_order and del orders[123]
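
The idea can be sketched with a hand-rolled list wrapper. This is not how difftrack is implemented; RecordingList and replay are hypothetical names used only to show the principle.

```python
class RecordingList:
    """Hypothetical list wrapper that records every write as a diff tuple."""

    def __init__(self):
        self._items = []
        self.diffs = []

    def insert(self, index, value):
        self._items.insert(index, value)
        self.diffs.append(('insert', index, value))

    def __setitem__(self, index, value):
        self._items[index] = value
        self.diffs.append(('replace', index, value))

    def __delitem__(self, index):
        del self._items[index]
        self.diffs.append(('delete', index, None))

    def snapshot(self):
        return list(self._items)


def replay(diffs):
    """Client side: rebuild the list purely from the recorded diffs."""
    state = []
    for operation, index, value in diffs:
        if operation == 'insert':
            state.insert(index, value)
        elif operation == 'replace':
            state[index] = value
        else:
            del state[index]
    return state
```

No comparison of old and new state is needed: the diffs are a by-product of the writes themselves, and replaying them on the client reproduces the producer's list.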

github.com/qntln/difftrack


import difftrack
orders = difftrack.ListDispatcher()
listener = difftrack.ListListener()
orders.add_listener(listener)

orders.insert(0, Order(side=BUY, price=23.95, quantity=500))
orders.insert(1, Order(side=SELL, price=24.30, quantity=100))
del orders[0]
print(listener.get_new_diffs())

[(<ListDiff.INSERT>, 0, Order(side=BUY, price=23.95, ...)),
 (<ListDiff.INSERT>, 1, Order(side=SELL, price=24.30, ...)),
 (<ListDiff.DELETE>, 0, None)]

Snapshots are automatically supported


>>> listener.get_snapshot()
[Order(side=SELL, price=24.30, quantity=100)]

Old diffs are dropped


>>> listener.get_new_diffs()
[]

There is more…

  • You can also track dictionary diffs.
  • You can compact diffs that cancel each other out.
    (coming soon…)
  • You can squash (aggregate) diffs affecting subsequent indices.
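
As a rough illustration of what shrinking a batch of dictionary diffs might look like, the sketch below keeps only the last write per key. The ('set' | 'delete', key, value) diff format and both helpers are assumptions made for this example; they are not difftrack's API, and this strategy is simpler than true cancellation (a set followed by a delete of a brand-new key still leaves one delete behind).

```python
def compact(diffs):
    """Keep only the last ('set' | 'delete', key, value) diff for every key."""
    last = {}
    for diff in diffs:
        key = diff[1]
        # Re-insert so that surviving diffs keep the order of their last write.
        last.pop(key, None)
        last[key] = diff
    return list(last.values())


def apply_diffs(state, diffs):
    """Apply dictionary diffs; deleting a missing key is deliberately a no-op."""
    state = dict(state)
    for operation, key, value in diffs:
        if operation == 'set':
            state[key] = value
        else:
            state.pop(key, None)
    return state
```

Because only the final operation on a key determines its final value, applying the compacted diffs gives the same dictionary as applying all of them.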

Beyond diffs

  • Consider using a custom binary protocol to send updates to clients.
  • We use Apache Avro to encode payloads.
  • Each message type has a schema which also serves as documentation.
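
To give a feel for the size difference, here is a sketch that uses the standard-library struct module as a stand-in. It is not Avro, and the field layout is an assumption made for this example; Avro additionally gives you the schemas mentioned above.

```python
import json
import struct

# Assumed layout: little-endian double timestamp, double price, unsigned 32-bit quantity.
TRADE_FORMAT = struct.Struct('<ddI')


def encode_trade(timestamp, price, quantity):
    return TRADE_FORMAT.pack(timestamp, price, quantity)


def decode_trade(payload):
    return TRADE_FORMAT.unpack(payload)


binary = encode_trade(1499530305.593857, 23.45, 500)
text = json.dumps({'time': 1499530305.593857, 'price': 23.45, 'quantity': 500})
```

The binary payload is a fixed 20 bytes, while the JSON text for the same trade is several times longer because it repeats the field names in every message.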

Inspiration


Thank you

quantlane.com