implementing a storage

Ruote comes with two storages, an in-memory one and a file system based one. There are other storages available each with its advantages and disadvantages.

There is also a composite storage for choosing a different storage depending on the object to store (expression, workitem, error, schedules, …).

Sometimes one has to implement his own storage, this page ought to help (if it doesn’t give a shout on the mailing list).

A storage implementation has to follow a contract : the storage interface.

a bit of history

Storages are at the heart of ruote 2.1.×. Simply look at :

            engine = Ruote::Engine.new(Ruote::Worker.new(ThatStorage.new(info)))
              # or
            engine = Ruote::Engine.new(ThatStorage.new(info))
            

Workers and engines need a storage, they communicate via a shared storage.

Ruote has an absolute need for persistence. Processes may outlive the engine runtime. When the engine(s) is/are down, the processes must sleep carefully in the storage, when workers are back, processes MUST resume.

Ruote 2.1.x push for multiple workers (breaking the 1 Ruby runtime limit) brought a need for preventing workers from overriding / trashing each other’s work.

documents

Ruote 2.1.x storage design was strongly influenced by CouchDB, especially the concept of documents and their revisions.

Ruote represents all of its data as documents, those documents are JSON serializable hashes with an “_id”, a “type” and a “_rev”.

The ‘type’ is a string like “expressions”, “errors”, “workitems”, “schedules”, …

The ‘_id’ attribute is the document identifier.

document types

There are 7+1 types of documents:

type description
expressions the atomic pieces of process instance
msgs ‘messages’ to apply/reply to expressions
schedules msgs scheduled for later processing
errors errors that occured during process execution (msgs processing)
variables engine (global) variables
configurations engine configurations
workitems StorageParticipant workitems
history (if you use Ruote::StorageHistory)

TODO

consistency

Ruote requires some documents to be reserved. If you can not offer exclusive
updates with your storage implementation you are limited to a single
worker

See the configuration page for more information on workers in the context of Ruote

interface

            #--
            # core methods
            #++
            
            # Store a document
            #
            # opts:
            # ... TODO ...
            #
            # returns:
            # * true if the document has been deleted from the store
            # * a document when the rev has changed
            # * nil when successfully stored
            #
            def put(doc, opts={})
            end
            
            # get a document by document type and key (_id)
            #
            def get(type, key)
            end
            
            # Delete a document
            #
            # returns:
            # * true if already deleted
            # * a document if the rev of the given document is not the rev of the current
            #   document
            # * nil when successfully removed
            #
            def delete(doc)
            end
            
            # Get many documents of a certain type
            #
            # input:
            # - When key is not specified, return everything from the store
            # - Else, do Array(key) and match the collected document id to the given list
            #
            # opts:
            # [:descending]    Return a list, sorted by _id in descending order
            # [:count]         Just return a count
            # [:skip]          Skip X results
            # [:limit]         Limit the list to a length of X
            #
            # returns:
            # * An Integer (when :count was specified)
            # * An array of documents
            #
            def get_many(type, key=nil, opts={})
            end
            
            # Return a list of ids for the given document type
            #
            def ids(type)
            end
            
            #--
            # auxiliary methods
            #++
            
            # Removes all msgs, schedules, errors, expressions and workitems.
            #
            # It's used mostly when testing workflows, usually when cleaning the
            # engine/storage before a workflow run.
            #
            def clear
            end
            
            # Clean the store
            #
            def purge!
            end
            
            # Add a new document type to the store. Some storages might need it.
            #
            def add_type(type)
            end
            
            # Clean the store for the given document type
            #
            def purge_type!(type)
            end
            

TODO

current implementations as models

Like for example hash_storage.rb

TODO

testing

Your storage implementation should survive unit testing and functional testing. If it is meant to be used by multiple workers it should survive concurrence tests as well.

Tests are run from the ruote/ directory. You pass the suffix of the implementation name (like “redis” or “sequel”) and the test will automatically

For the rest of this section, let’s assume you’ve named your storage “ruote-pasta” and you’ve placed it in a ruote-pasta/ dir adjacent to your ruote/ clone.

Your storage implementation must provide a test/functional_connection.rb file containing a “new_storage” method. This will indicate to the ruote test suites how to set up your storage for testing. As examples, look at the functional_connection.rb files for ruote-redis and ruote-sequel.

unit testing

Among the ruote unit tests, there is only one that is relevant when testing storages.

            $ ruby test/unit/storage.rb -- --pasta
            

Making this test green is your first objective when developing a storage.

functional testing

Your storage implementation should survive all the functional tests.

            $ ruby test/functional/test.rb -- --pasta
            

If you have difficulties with some dangling test, seek help on the mailing list.

concurrence tests (multiple workers)

There are currently 3 tests that push ruote storages hard, as if multiple workers were running. They are found under test/functional/ and are prefixed with “ct_”.

The technique used to make sure a storage is multi-worker enabled can be described as “run each of the ct_ tests 200 times and if it never fails, then the storage implementation is OK with multiple workers”.

            $ ./test/looper.sh test/functional/ct_0_concurrence.rb -- --pasta 200
            $ ./test/looper.sh test/functional/ct_1_iterator.rb -- --pasta 200
            $ ./test/looper.sh test/functional/ct_2_cancel.rb -- --pasta 200
            

Those tests try to force collisions among workers, hopefully witnessing the storage implementation work around the issue gracefully.