Twitter popularized the term "firehose API" to mean a realtime stream of data sent through a persistent connection. But even if you're not a realtime service, streaming APIs are great for pushing data from the backend to clients. They reduce resource usage because the server can decide when it's a good time to send an incremental chunk of data. They can also improve the responsiveness of your user experience. The same HTTP API can be reused to power multiple different apps. For example, you could write your web frontend with a JavaScript framework like Backbone.js, but reuse the same API to power a native iOS application. Follow the jump to read about how streaming APIs work, and how you can write one with Rack::Stream.
TL;DR
Rack::Stream is Rack middleware that lets you write streaming API endpoints that understand HTTP, WebSockets, and EventSource. It comes with a DSL and can be used alongside other Rack-based web frameworks such as Sinatra and Grape.
What's Streaming HTTP?
Normally, when an HTTP request is made, the server closes the connection when it's done processing the request. For streaming HTTP, also known as Comet, the main difference is that the server doesn't close the connection and can continue sending data to the client at a later time.
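To make this concrete, here's a minimal sketch (the class name `SlowBody` is made up for illustration) of the Rack convention that makes streaming possible: a response body only has to respond to `#each`, and the server writes each yielded string to the socket as it's produced, rather than waiting for the whole body:

```ruby
# Illustrative only: a Rack response body is any object that responds
# to #each. The server writes each yielded string to the client as it
# is yielded, so a body can keep producing data over time.
class SlowBody
  def each
    3.times do |i|
      yield "tick #{i}\n"
      # a real app would sleep, or wait on a timer or IO event, here
    end
  end
end

app = lambda do |env|
  [200, {'Content-Type' => 'text/plain'}, SlowBody.new]
end
```

Comet-style servers build on this idea by keeping the socket open and deferring those yields until there's actually something to send.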
To prevent the connection from closing, rack-stream uses Thin's 'async.callback' to defer closing the connection until either the server decides to close the connection, or the client disconnects.
Rack::Stream
Rack::Stream is Rack middleware that lets you write streaming HTTP endpoints that can understand multiple protocols. Supporting multiple protocols means you can write an API endpoint that works with curl, and that same endpoint will also work with WebSockets in the browser. The simplest streaming API you can make is:
```ruby
# config.ru
# run with `thin start -p 9292`
require 'rack-stream'

class App
  def call(env)
    [200, {'Content-Type' => 'text/plain'}, ["Hello", " ", "World"]]
  end
end

use Rack::Stream
run App
```
If you ran this basic rack app, you could then use curl to stream its response:
```
> curl -i -N http://localhost:9292/
HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked
Connection: close
Server: thin 1.3.1 codename Triple Espresso

Hello World
```
This isn't very exciting, but you'll notice that the `Transfer-Encoding` for the response is set to `chunked`. By default, rack-stream will take any downstream application's response bodies and stream them over in chunks. You can read more about chunked transfer encoding on Wikipedia.
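To show what that looks like on the wire, here's a quick sketch (`encode_chunk` is a made-up helper, not part of rack-stream) of how chunked transfer encoding frames each piece of data: the chunk's byte count in hex, a CRLF, the bytes themselves, another CRLF, and finally a zero-length chunk to terminate the stream:

```ruby
# Illustrative only: frame a string as a single HTTP/1.1 chunk.
# Format: <size in hex>\r\n<data>\r\n
def encode_chunk(data)
  "#{data.bytesize.to_s(16)}\r\n#{data}\r\n"
end

# A zero-sized chunk marks the end of the response.
TERMINATOR = "0\r\n\r\n"

encode_chunk("Hello")         # => "5\r\nHello\r\n"
encode_chunk("Chunky Monkey") # => "d\r\nChunky Monkey\r\n"
```

Because each chunk carries its own length, the server never needs to know the total `Content-Length` up front, which is exactly what makes open-ended streaming possible.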
Let's spice it up a bit and build an actual firehose. This next application will keep sending data to the client until the client disconnects:
```ruby
require 'rack-stream'

class Firehose
  include Rack::Stream::DSL

  def call(env)
    EM.add_periodic_timer(0.1) { chunk "\nChunky Monkey" }
    [200, {'Content-Type' => 'text/plain'}, ['Hello']]
  end
end

use Rack::Stream
run Firehose
```
The first thing to notice is that the Firehose rack endpoint includes `Rack::Stream::DSL`. This module provides convenience methods that allow you to access `env['rack.stream']`, which is injected into `env` whenever you `use Rack::Stream`. When a request comes in, the `#call` method schedules a timer that runs every 0.1 seconds and uses the `#chunk` method to stream data. If you run curl, you would see:
```
> curl -i -N http://localhost:9292/
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Connection: close
Server: thin 1.3.1 codename Triple Espresso

Hello
Chunky Monkey
Chunky Monkey
Chunky Monkey
# ... more monkeys
```
rack-stream also lets you register callbacks to manipulate response chunks and to control when data is sent. Here's a more advanced example with callbacks added:
```ruby
require 'rack-stream'

class Firehose
  include Rack::Stream::DSL

  def call(env)
    after_open do
      chunk "\nChunky Monkey"
      close  # start closing the connection
    end

    before_chunk do |chunks|
      chunks.map(&:upcase)  # manipulate chunks
    end

    before_close do
      chunk "\nGoodbye!"  # send something before we close
    end

    [200, {'Content-Type' => 'text/plain'}, ['Hello']]
  end
end

use Rack::Stream
run Firehose
```
If you ran curl now, you would see:
```
> curl -i -N http://localhost:9292/
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Connection: close
Server: thin 1.3.1 codename Triple Espresso

HELLO
CHUNKY MONKEY
GOODBYE!
```
For details about the callbacks, see the project page.
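If you're curious how a `before_chunk`-style hook could work, here's a stripped-down sketch in plain Rack (this is not rack-stream's actual implementation, just an illustration of the idea): a middleware that intercepts each body chunk and transforms it before it reaches the client:

```ruby
# Illustrative only: a plain Rack middleware that upcases each body
# chunk on its way out, much like a before_chunk callback would.
class UpcaseChunks
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    # Wrap the body so each chunk is transformed lazily as it streams.
    transformed = Enumerator.new do |yielder|
      body.each { |chunk| yielder << chunk.upcase }
    end
    [status, headers, transformed]
  end
end
```

rack-stream's real callbacks additionally run under EventMachine, so they can fire after the initial response has been sent, but the chunk-rewriting idea is the same.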
Up until this point, I've only used curl to demonstrate hitting the rack endpoint, but one of the big benefits of rack-stream is that it'll automatically recognize WebSocket and EventSource requests and stream through those as well. For example, you could write an HTML file that accesses that same endpoint:
```html
<html>
  <body>
    <script type='text/javascript'>
      var socket = new WebSocket('ws://localhost:9292/');
      socket.onopen    = function()  { alert("socket opened"); };
      socket.onmessage = function(m) { alert(m.data); };
      socket.onclose   = function()  { alert("socket closed"); };
    </script>
  </body>
</html>
```
Whether you access the endpoint with curl, ajax, or WebSockets, your backend API logic doesn't have to change.
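Under the hood, the protocol can be sniffed from the request itself. This hypothetical helper (not rack-stream's code) shows the kind of header checks involved: a WebSocket handshake carries an `Upgrade: websocket` header, while an EventSource request advertises `text/event-stream` in its `Accept` header:

```ruby
# Hypothetical sketch of protocol detection from a Rack env.
# Not rack-stream's actual implementation.
def detect_protocol(env)
  if env['HTTP_UPGRADE'].to_s.downcase == 'websocket'
    :websocket    # WebSocket handshake (RFC 6455)
  elsif env['HTTP_ACCEPT'].to_s.include?('text/event-stream')
    :eventsource  # Server-Sent Events / EventSource
  else
    :http         # plain chunked HTTP
  end
end

detect_protocol('HTTP_UPGRADE' => 'websocket')        # => :websocket
detect_protocol('HTTP_ACCEPT' => 'text/event-stream') # => :eventsource
detect_protocol({})                                   # => :http
```

Once the transport is picked, the application code above it stays the same; only the framing of each chunk differs.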
For the last example, I'll show a basic chat application using Grape and Rails. The full runnable source is included in the `examples/rails` directory.
```ruby
require 'grape'
require 'rack/stream'
require 'redis'
require 'redis/connection/synchrony'

class API < Grape::API
  default_format :txt

  helpers do
    include Rack::Stream::DSL

    def redis
      @redis ||= Redis.new
    end

    def build_message(text)
      redis.rpush 'messages', text
      redis.ltrim 'messages', 0, 50
      redis.publish 'messages', text
      text
    end
  end

  resources :messages do
    get do
      after_open do
        # subscribe after_open b/c this runs until the connection is closed
        redis.subscribe 'messages' do |on|
          on.message do |channel, msg|
            chunk msg
          end
        end
      end

      status 200
      header 'Content-Type', 'application/json'
      chunk *redis.lrange('messages', 0, 50)
      ""
    end

    post do
      status 201
      build_message(params[:text])
    end
  end
end
```
This example uses Redis pub/sub to push out messages that are created from `#post`. Thanks to em-synchrony, requests are not blocked when no messages are being sent. It's important to do the Redis subscribe after the connection has been opened; otherwise, the initial response won't be sent.
What about socket.io?
socket.io is great because it provides many transport fallbacks to give maximum compatibility with many different browsers, but its pub/sub interface is too low-level to capture common app semantics. The application developer doesn't get nice REST features like HTTP verbs, resource URIs, parameter and response encoding, or request headers.
The goal of rack-stream is to provide clean REST-like semantics when you're developing, but allow you to swap out different transport protocols. Currently, it supports normal HTTP, WebSockets, and EventSource. But the goal is to support more protocols over time and allow custom protocols. This architecture allows socket.io to become another protocol handler that can be plugged into rack-stream. If you wanted to use Pusher as a protocol, that could also be written as a handler for rack-stream.
Summary
rack-stream aims to be a thin abstraction that lets Ruby developers write streaming APIs with their preferred frameworks. I plan to broaden support and test against common use cases and popular frameworks like Sinatra and Rails. If you have any questions or comments, feel free to submit an issue or leave a comment below!