๐งน Maintenance¶
Over time a pipe's target table accumulates dead rows, stale planner statistics, and uncompressed history. Meerschaum provides a family of maintenance actions to keep tables lean and queries fast. Each action selects pipes with the standard -c / -m / -l / -i / -t filters and prints a per-pipe results table.
| Action | Purpose | Pipe method |
|---|---|---|
show sizes |
Report on-disk size per pipe | Pipe.get_size() |
show partitions |
Report partition / chunk counts and width | โ |
compress pipes |
Reclaim space by compressing history | Pipe.compress() |
decompress pipes |
Reverse compression | Pipe.decompress() |
vacuum pipes |
Reclaim dead-row space | Pipe.vacuum() |
analyze pipes |
Refresh planner statistics | Pipe.analyze() |
partition pipes |
Rebuild to a new partition width | Pipe.repartition() |
Instance connector support
These operations are backed by instance-connector methods (get_pipe_size(), compress_pipe(), vacuum_pipe(), etc.). The SQLConnector implements them per-flavor, and they are wired through api: instances. A connector that doesn't support an operation returns an informative failure rather than raising.
๐พ Disk Usage¶
Run show sizes to list each pipe's target table size on disk, largest first:
In Python, Pipe.get_size() returns the number of bytes (or None if the size can't be determined):
import meerschaum as mrsm
pipe = mrsm.Pipe('demo', 'weather', instance='sql:main')
print(pipe.get_size())
# 1859584
Sizes are measured with each flavor's native query โ TimescaleDB hypertable size functions, pg_total_relation_size() for PostgreSQL / PostGIS (summed across partitions), data_length + index_length for MySQL / MariaDB, reserved pages for MSSQL, and dbstat for SQLite.
๐๏ธ Compression¶
For large historical tables, compression can dramatically reduce disk usage. Run compress pipes to compress a pipe's history:
The mechanism depends on the flavor:
- TimescaleDB โ enables the Hypercore columnstore, installs a columnstore (compression) policy, and converts existing chunks. By default (the
hypercoreparameter) a new hypertable already has the columnstore enabled; thecompressparameter additionally (re)installs a policy on sync. - MySQL / MariaDB โ
ROW_FORMAT=COMPRESSED. - MSSQL โ
DATA_COMPRESSION = PAGE.
One-shot compression
Pass --no-policy to compress existing data now without installing an ongoing policy (and, for decompress pipes, to decompress now while leaving the policy in place):
Reverse compression with decompress pipes, the inverse of compress pipes:
For TimescaleDB this removes the columnstore policy, converts compressed chunks back to row-store, and disables the columnstore so future synced chunks stay uncompressed.
Mark a pipe for automatic compression
Set the compress parameter (a bool or a dict of after / segmentby / orderby) so a policy is installed automatically on sync:
๐งฝ Vacuuming and Analyzing¶
vacuum pipes reclaims space left by deleted or updated rows:
On the PostgreSQL family a plain VACUUM reclaims dead tuples internally but does not return space to the operating system โ pass --full to run VACUUM FULL, which rewrites the table (taking an exclusive lock):
Other flavors fall back to their native mechanisms: OPTIMIZE TABLE for MySQL / MariaDB, an index rebuild for MSSQL, and VACUUM (of the database file) for SQLite.
analyze pipes refreshes the database planner's statistics so it chooses better query plans after a large sync. Unlike vacuuming, it does not reclaim disk space:
๐งฉ Partitions¶
For pipes with native range partitioning (or TimescaleDB hypertables), show partitions reports the partition / chunk count, the physical partition width, and the approximate number of rows per partition โ a useful signal for tuning the width:
Repartitioning¶
verify.chunk_minutes is the authoritative partition width, read at sync time. To change the width of an existing table, use partition pipes (don't just edit the parameter โ a changed width laid over an existing grid produces overlapping partitions):
# Rebuild the 'weather' pipes to 7-day partitions.
mrsm partition pipes -i sql:main -m weather --chunk-minutes 10080
- TimescaleDB applies the new interval to future chunks via
set_chunk_time_interval(); existing chunks keep their size (no rewrite). - PostgreSQL / PostGIS, MySQL / MariaDB, MSSQL rebuild the table at the new width: the data is read, the table is dropped, and the data is re-synced.
Rebuild cost
The non-TimescaleDB rebuild reads the whole table into memory and briefly drops it before re-syncing. Run it during a maintenance window for large tables, and choose a sensible width up front to avoid repartitioning later.
๐ค Scheduling Maintenance¶
Like any action, maintenance can be scheduled as a background job. For example, compress and vacuum nightly: