
🎼 Meerschaum Compose

The compose plugin does for Meerschaum what Docker Compose does for Docker: it lets you consolidate everything your project needs, including all of its pipes and configuration, into a single YAML file.

With mrsm compose up, you can stand up syncing jobs for the pipes defined in the Compose project (one job per instance). Because the configuration (e.g. custom connectors) is contained in the YAML file, Compose projects are useful for prototyping, collaboration, and consistency.

Multiple Compose Files

For complicated projects, a common pattern is to include multiple Compose files and run them with --file:

mrsm compose run --file mrsm-compose-00-extract.yaml && \
mrsm compose run --file mrsm-compose-01-transform.yaml && \
mrsm compose run --file mrsm-compose-02-load.yaml

This pattern allows multiple projects to cleanly share root and plugins directories.
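
As a rough sketch of what one file in such a pipeline might contain, assuming the three files live in the same directory: each file points at the same root and plugins directories but carries its own project_name, since that name tags the pipes of the file. The project name "extract" and the single pipe are illustrative only.

```yaml
# mrsm-compose-00-extract.yaml (illustrative sketch, not a required layout)
project_name: "extract"    # hypothetical name; must be unique per compose file
root_dir: "./root"         # the same root directory in every file
plugins_dir: "./plugins"   # the same plugins directory in every file

pipes:
  - connector: "plugin:noaa"
    metric: "weather"
    location: "atlanta"
```

The transform and load files would repeat root_dir and plugins_dir verbatim and change only project_name and pipes.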

Example Compose File

This compose project demonstrates how to sync two pipes to a new database awesome.db:

sync:
  schedule: "every 30 seconds"

pipes:
- connector: "plugin:noaa"
  metric: "weather"
  location: "atlanta"
  parameters:
    noaa:
      stations:
        - "KATL"

- connector: "sql:awesome"
  metric: "fahrenheit"
  target: "atl_temp"
  parameters:
    query: |-
      SELECT
        timestamp,
        station,
        (("temperature (degC)" * 1.8) + 32) AS fahrenheit
      FROM plugin_noaa_weather_atlanta
    columns:
      datetime: "timestamp"
      id: "station"

plugins:
  - "noaa"

config:
  meerschaum:
    instance: "sql:awesome"
    connectors:
      sql:
        awesome:
          database: "awesome.db"
          flavor: "sqlite"

⛺ Setup

Template Project

Want to skip the setup and work in a pre-configured environment? Create a new repository from the Meerschaum Compose Project Template.

Install the compose plugin from the public repository api:mrsm:

mrsm install plugin compose

From a new directory, create a file mrsm-compose.yaml. You can paste the example file above to get started.

mkdir awesome-sauce && \
  cd awesome-sauce && \
  vim mrsm-compose.yaml

Plugins Directories

You may set multiple paths for plugins_dir. This is very useful if you want to group plugins together. A value of null will include the environment's plugins in your project.

plugins_dir:
  - "./plugins"
  - null

🪖 Commands

If you've used docker-compose, you'll catch on to Meerschaum Compose pretty quickly. Here's a quick summary:

| Command | Description | Useful Flags |
|---|---|---|
| compose init | Initialize a new project and install dependencies. | |
| compose run | Update and sync the pipes defined in the compose file. | --debug: Verbosity toggle. All flags are passed to sync pipes. |
| compose up | Bring up the syncing jobs (one process per instance). | -f: Follow the logs once the jobs are running. |
| compose down | Take down the syncing jobs. | -v: Drop the pipes ("volumes"). |
| compose logs | Follow the jobs' logs. | --nopretty: Print the log files instead of following. |
| compose ps | Show the running status of background jobs. | |

For our example project awesome-sauce, let's bring up the syncing jobs:

mrsm compose up -f

All other commands are executed as regular actions from within the project environment:

### Print out the environment variables set by the compose file.
mrsm compose show environment

### Verify that custom connectors are able to be parsed.
mrsm compose show connectors sql:awesome

### The default instance for this project is sql:awesome, so pipes will be fetched from there by default.
mrsm compose show pipes
mrsm compose show columns
mrsm compose show rowcounts

### Jump into an interactive CLI.
mrsm compose sql awesome

🎌 Flags

The compose plugin adds a few new custom flags. You can quickly view the available flags with mrsm -h or mrsm show help.

| Flag | Description | Example |
|---|---|---|
| -f | Follow the logs when running compose up. | mrsm compose up -f |
| -v, --volumes | Delete pipes when running compose down. | mrsm compose down -v |
| --dry | For compose up, update the pipes' registrations but don't actually begin syncing. | mrsm compose up --dry |
| --file, --compose-file | Specify an alternate compose file (default: mrsm-compose.yaml). | mrsm compose show connectors --file config-only.yaml |
| --env, --env-file | Specify an alternate environment file (default: .env). | mrsm compose show environment --env secrets.env |

🧬 Schema

Below are the supported top-level keys in a Compose file. Note that all keys are optional.

  • pipes
    List all of the pipes to be used in this project. See The pipes Key section below.
  • sync
    Govern the behavior of the syncing process. See The sync Key section below.
  • jobs
    If provided, compose up will start the defined jobs. See The jobs Key below.
  • project_name (default to directory name)
    The tag given to all pipes in this project. Defaults to the name of the current directory. If you're using multiple compose files, make sure each file has a unique project name.
  • root_dir (default ./root/)
    A path to the root directory; see MRSM_ROOT_DIR.
  • plugins_dir (default ./plugins/)
    Either a path or a list of paths to the plugins directories. A value of null will include the current environment's plugins directories in the project. See MRSM_PLUGINS_DIR.
  • plugins
    A list of plugins expected to be in the plugins directory. Missing plugins will be installed from api:mrsm.
    To install from a custom repository, append @api:<label> to the plugins' names or set the configuration variable meerschaum:default_repository.
  • config
    Configuration keys to be patched on top of your host configuration; see MRSM_CONFIG. Custom connectors should be defined here.
  • environment
    Additional environment variables to pass to subprocesses.
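
To see how these keys might fit together, here is a minimal skeleton that uses only the top-level keys listed above. It is a sketch under assumptions: the plugin name my-plugin, the repository label custom, and the variable MY_API_TOKEN are hypothetical, and environment is assumed to take a mapping of variable names to values.

```yaml
project_name: "awesome-sauce"   # defaults to the directory name if omitted
root_dir: "./root"              # the documented default
plugins_dir: "./plugins"        # the documented default

plugins:
  - "noaa"                      # installed from api:mrsm if missing
  - "my-plugin@api:custom"      # hypothetical plugin from a hypothetical repository

environment:
  MY_API_TOKEN: "changeme"      # hypothetical variable (mapping form is an assumption)

sync:
  schedule: "every 30 seconds"

config:
  meerschaum:
    instance: "sql:awesome"
    connectors:
      sql:
        awesome:
          database: "awesome.db"
          flavor: "sqlite"

pipes:
  - connector: "plugin:noaa"
    metric: "weather"
    location: "atlanta"
```

Because every key is optional, any of these sections can be dropped and the project falls back to the defaults noted in the list.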

Accessing the host configuration

The Meerschaum Compose YAML file also supports Meerschaum symlinks. For example, to alias a new connector sql:foo to your host's sql:main:

config:
  meerschaum:
    connectors:
      sql:
        foo: MRSM{meerschaum:connectors:sql:main}

The pipes Key

The pipes key contains a list of keyword arguments to build mrsm.Pipe objects, notably:

  • connector (required)
  • metric (required)
  • location (default null)
  • instance (default to config:meerschaum:instance)
  • parameters
  • columns (alias for parameters:columns)
  • target (alias for parameters:target)
  • tags (alias for parameters:tags)
  • dtypes (alias for parameters:dtypes)
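
The aliases above could be used as in the following sketch, which restates the second pipe from the example project with columns and target at the top level and adds tags and dtypes. The tag value and the "float64" dtype are assumptions chosen for illustration, not values taken from the example project.

```yaml
pipes:
  - connector: "sql:awesome"
    metric: "fahrenheit"
    location: "atlanta"
    instance: "sql:awesome"     # optional; falls back to config:meerschaum:instance
    target: "atl_temp"          # alias for parameters:target
    columns:                    # alias for parameters:columns
      datetime: "timestamp"
      id: "station"
    tags:                       # alias for parameters:tags
      - "weather"               # hypothetical tag
    dtypes:                     # alias for parameters:dtypes
      fahrenheit: "float64"     # assumed dtype name
    parameters:
      query: |-
        SELECT
          timestamp,
          station,
          (("temperature (degC)" * 1.8) + 32) AS fahrenheit
        FROM plugin_noaa_weather_atlanta
```

Keys without an alias, such as query here, still go under parameters.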

The sync Key

Keys under the parent key sync are the following:

  • schedule
    Define a regular interval for syncing processes by setting a schedule.
    This corresponds to the flag -s / --schedule.
  • min_seconds (default 1)
    If a schedule is not set, pipes will be constantly synced and sleep min_seconds between laps.
    This corresponds to the flags --min-seconds and --loop.
  • timeout_seconds
    If this value is set, long-running syncing processes which exceed this value will be terminated.
    This corresponds to the flag --timeout-seconds / --timeout.
  • args
    This value may be a string or list of command-line arguments to append to the syncing command.
    This option is available for edge cases, such as when working with custom flags or specific intervals (e.g. --begin and --end).
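
As an illustration of the keys above, a sync block without a schedule might look like the sketch below. The numeric values and dates are arbitrary, and the date format passed to --begin and --end is an assumption.

```yaml
sync:
  # No schedule is set, so pipes sync continuously,
  # sleeping min_seconds between laps.
  min_seconds: 30
  # Terminate any syncing process that runs longer than 10 minutes.
  timeout_seconds: 600
  # Extra command-line arguments appended to the syncing command (string form).
  args: "--begin 2024-01-01 --end 2024-02-01"
```

To sync on a fixed interval instead, set schedule as in the example project.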

The jobs Key

Keys under jobs are the names of jobs to be run with compose up. If defined, these jobs override the default syncing jobs.

Example jobs

jobs:
  sync: "sync pipes -s 'every 2 hours starting 00:30'"
  verify: "verify pipes -s 'daily starting 12:00 tomorrow'"
  date: "date -s 'every 10 seconds'"
  echo: "echo 'Hello, World!'"