Usage¶
CLI¶
Rabbit Hole is a command line tool that has been written as a lightweight alternative to logstash for the specific use case in which the input is an amqp server and the output is a SQL database.
It can be executed from the command line like this:
$ rabbithole config.yml
where config.yml is a YAML configuration file. For example:
size_limit: 5
time_limit: 15
blocks:
- name: input
type: amqp
kwargs:
url: 'ampq://username:password@localhost:5672'
- name: output
type: sql
kwargs:
url: 'postgres://username:password@localhost:5432/db_name'
flows:
- - name: input
kwargs:
exchange: logs
exchange_type: fanout
durable: true
- name: output
kwargs:
query:
INSERT INTO logs (timestamp, message)
VALUES (CAST (:timestamp AS TIMESTAMP), :message)
parameters:
timestamp: timestamp
message: message.text
- - name: input
kwargs:
exchange: events
exchange_type: fanout
durable: true
- name: output
kwargs:
query:
INSERT INTO events (timestamp, message)
VALUES (CAST (:timestamp AS TIMESTAMP), :message)
parameters:
timestamp: timestamp
message: message.text
- where:
- size_limit: batcher size limit
- time_limit: batcher size limit
- blocks: list of building blocks to use in the flows
- flows: list of blocks connected to transfer information information
Blocks¶
A block rabbithole is the name of the little piece that can be added to a flow to receive/send messages as needed to build the desired flow of information. There are currently three different kinds of blocks:
input
an input block is a block that receives a messages from an external source, such as an amqp server, and transfers them as they are received to the next block in the flow.batchers
rabbithole uses the concept of batchers that is also used in logstash. A batcher is just an in-memory queue whose goal is to output data more efficiently by writing multiple messages at once. It keeps messages in memory until its capacity has been filled up or until a time limit is exceeded. Both parameters can be set in the configuration file.
Batchers are automatically added between blocks in a flow, so there’s no need to include them explicitly in the configuration file.
output
an output block is a block that receives messages from the previous block and sends them to an external output such as a database.
Flow¶
A flow is a sequence of blocks that are connected to transfer information from the initial input block to the final output one.
Available blocks¶
The following blocks are available in rabbithole.
amqp¶
ampq is an input flow that can receive data from amqp servers.
blocks:
- name: input
type: amqp
kwargs:
url: 'ampq://username:password@localhost:5672'
flows:
- - name: input
kwargs:
exchange: logs
exchange_type: fanout
durable: true
- where:
- url: is the AMQP connection string.
- exchange is the name of the exchange for which messages will be transferred in a given flow.
- additonal parameters are optional and passed directly to pika.channel.Channel.exchange_declare.
sql¶
sql is an output flow that can write data to SQL databases.
blocks:
- name: output
type: sql
kwargs:
url: 'postgres://username:password@localhost:5432/db_name'
flows:
- - name: output
kwargs:
query:
INSERT INTO logs (timestamp, message)
VALUES (CAST (:timestamp AS TIMESTAMP), :message)
parameters:
timestamp: timestamp
message: message.text
- where:
- url is the database connection string.
- query is the query to execute when a message is received in a given flow.
- parameters is an optional mapping from the message received to the object pased to the query (useful when the message contains nested data since nesting is not supported in query parameters).