📦️ Compose Projects
Now that you've got the idea behind pipes, let's write our own plugins and put together a project using Meerschaum Compose.
Meerschaum Compose Template Repository
If you'd like to jump straight into a Dockerized environment, create a new repository from the Meerschaum Compose Project Template.
Create a new project directory awesome-sauce. This is where our code and data will live (for now).
Paste the following into mrsm-compose.yaml. This defines our pipes and runtime environment.
There are two connectors used in this project:
- plugin:fred
  A plugin we will write shortly. This is the data source for our initial pipes.
- sql:tiny
  A SQLite file, tiny.db. It's used as the instance connector as well as a data source.
The SQLite database makes sense, but what is plugin:fred? FRED is the data source we want to use for this project, so let's create our first plugin to fetch this data.
Create a new directory plugins:
And paste the following into fred.py:
The plugin provides one function, fetch(), that takes a pipe, pulls an ID from pipe.parameters, and returns the appropriate DataFrame.
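To make the shape of such a fetch() function concrete, here is a minimal offline sketch. It is not the tutorial's fred.py: the parameter layout (pipe.parameters["fred"]["series_id"]), the CSV endpoint, the read_text hook, and the helper names are all assumptions made for illustration. It also returns plain rows instead of a DataFrame so that it runs with only the standard library; in a real plugin you would likely wrap the rows in pandas.DataFrame(rows).

```python
import csv
import io
from types import SimpleNamespace
from typing import Any, Callable, Dict, List, Optional

# Assumed endpoint for illustration; the tutorial's fred.py may differ.
CSV_BASE_URL = "https://fred.stlouisfed.org/graph/fredgraph.csv"


def build_url(series_id: str) -> str:
    """Build the CSV download URL for one series ID."""
    return f"{CSV_BASE_URL}?id={series_id}"


def parse_csv(text: str) -> List[Dict[str, Any]]:
    """Parse two-column CSV text (date, value) into DATE / PRICE rows."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    rows = []
    for date, value in reader:
        # Treat "." as a missing observation (an assumption about the feed).
        price = None if value == "." else float(value)
        rows.append({"DATE": date, "PRICE": price})
    return rows


def fetch(pipe, read_text: Optional[Callable[[str], str]] = None, **kwargs):
    """Pull the series ID from pipe.parameters and return its rows."""
    # Hypothetical parameter layout: {"fred": {"series_id": "..."}}.
    series_id = pipe.parameters["fred"]["series_id"]
    if read_text is None:
        from urllib.request import urlopen

        def read_text(url: str) -> str:
            with urlopen(url) as response:
                return response.read().decode("utf-8")

    return parse_csv(read_text(build_url(series_id)))


# Offline demonstration with a stand-in pipe and made-up CSV text.
fake_pipe = SimpleNamespace(parameters={"fred": {"series_id": "EXAMPLE_ID"}})
sample_csv = "DATE,EXAMPLE_ID\n2023-01-01,4.82\n2023-02-01,.\n"
rows = fetch(fake_pipe, read_text=lambda url: sample_csv)
print(rows)
```

Injecting read_text keeps the network call out of the way while testing; when it is omitted, the sketch falls back to downloading the CSV with urllib.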
Now that we've got our YAML file and plugin, install the compose plugin:
Let's initialize our environment to install our dependencies (pandas) into the project's virtual environment for plugin:fred. From the parent project directory, run compose init:
Run the project file to sync the pipes one-at-a-time:
🎉 Success! You've just run an ETL pipeline to process the following steps:
- Ingest egg prices in the US.
- Ingest chicken prices in the US.
- Join the tables on DATE.
  Grow the table horizontally by splitting PRICE into PRICE_EGGS and PRICE_CHICKEN.
- Union the tables.
  Grow the table vertically by adding the index FOOD ('chicken' or 'eggs').
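The join and union steps above can be sketched against an in-memory SQLite database using Python's built-in sqlite3 module. This is only an illustration of the two operations, not the project's actual pipe definitions: the table names eggs and chicken and all prices are made up, and the queries in mrsm-compose.yaml may be written differently.

```python
import sqlite3

# In-memory database with two hypothetical source tables and made-up prices.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE eggs (DATE TEXT, PRICE REAL);
CREATE TABLE chicken (DATE TEXT, PRICE REAL);
INSERT INTO eggs VALUES ('2023-01-01', 4.0), ('2023-02-01', 3.5);
INSERT INTO chicken VALUES ('2023-01-01', 2.0), ('2023-02-01', 2.5);
""")

# Join on DATE: one row per date, PRICE split into two columns (wider table).
joined = conn.execute("""
SELECT e.DATE, e.PRICE AS PRICE_EGGS, c.PRICE AS PRICE_CHICKEN
FROM eggs AS e
INNER JOIN chicken AS c ON e.DATE = c.DATE
ORDER BY e.DATE
""").fetchall()

# Union: stack both tables and tag each row with FOOD (taller table).
unioned = conn.execute("""
SELECT DATE, PRICE, 'eggs' AS FOOD FROM eggs
UNION ALL
SELECT DATE, PRICE, 'chicken' AS FOOD FROM chicken
ORDER BY DATE, FOOD
""").fetchall()

print(joined)
print(unioned)
conn.close()
```

With two dates per source table, the join yields two rows with three columns, while the union yields four rows, each identified by the pair (DATE, FOOD).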
All other Meerschaum actions are executed within the context of this project. For example, let's verify the most recent data with show data:
When developing, it's useful to hop into a REPL and test out the Python API.
Thank you for making it through the Getting Started guide! This example was based on the May 2023 Tech Slam 'N Eggs demo project.
There's plenty more great information, such as the plugins guide. Have fun building your pipes with Meerschaum!