🪵 Changelog
3.0.x Releases
This is the current release cycle, so stay tuned for future releases!
v3.0.9 – v3.0.10
- Disable CLI daemon by default.
  New installations will now disable the CLI daemon, which may be enabled by setting `system.experimental.cli_daemon` to `true`.
- Support `geopackage` SQLConnectors.
  The flavor `geopackage` (extension of `sqlite`) enables syncing geometry data as GeoPackage-WKB (as well as the necessary `gpkg` tables if missing).
- Check for missing plugins paths for `bootstrap plugin`.
  The `bootstrap plugin` wizard now ensures that configured plugins paths (i.e. `MRSM_PLUGINS_DIR`) exist before proceeding.
- Add FontAwesome icons to the Dash app.
  Plugins using the `@web_page` feature will now have access to FontAwesome icons, which are now statically hosted.
- Fix the Valkey stack data directory.
  The move from `bitnami/valkey` to `valkey/valkey` incorrectly pointed the Valkey volume to `/valkey/data` instead of `/data`. This error has been rectified.
v3.0.6 – v3.0.8
- Fix stack config for new deployments.
  New stack deployments behave as expected when symlinks fail to resolve.
- Handle other null-like values when syncing to PostgreSQL.
  The null detection has been improved when syncing in bulk to PostgreSQL.
- Fix web dashboard SQL editor.
  The SQL editor on the web dashboard now correctly updates SQL queries.
- Add "Resolve symlinks" to web dashboard SQL Editor.
  The SQL Editor on the pipes cards now includes a toggle to resolve symlinks in the SQL query, allowing you to make changes to the query without overwriting the symlinks.
- Improve performance when fetching pipes' tags on the web dashboard.
  Cache misses are now better handled when fetching tags and other keys on the web dashboard.
- Preserve symlinks when copying pipes.
  The action `copy pipes` now preserves symlinks for new pipes.
- Fix edge case for `unload_plugins()`.
  Explicitly providing an empty list no longer unloads all plugins.
v3.0.3 – v3.0.5
- Fix pipe page routing.
  Links to specific pipes in the dashboard are now routed correctly.
- Fix cache invalidation for pipes using Valkey connectors.
  Pipes using a `valkey` connector to store cache (rather than on-disk) will now correctly invalidate cache on demand.
- Fix Python 3.9 compatibility.
  An issue breaking Python 3.9 has been fixed.
- Prevent resolving symlinks for pipe's parameters card.
  The pipe card in dash will no longer resolve symlinks.
- Handle `pd.NA` values in JSON columns for `filter_unseen_df()`.
  Non-string values (e.g. `pd.NA`) are now correctly handled for JSON columns in `filter_unseen_df()`.
- Resolve dependencies when running the `sql` CLI.
  Running with the `sql` action will now resolve expected dependencies.
- Refine unloading plugins.
  Rather than always popping the root `plugins` package from `sys.modules`, unloaded plugins are instead deleted directly from the root package. The root `plugins` package is popped only if all plugins are unloaded at once.
- Disable `uv` when running inside a virtual environment.
v3.0.1 – v3.0.2
- Change the working directory for each action executed by the CLI daemon.
  The CLI daemon now changes working directory to match the context of the calling client. This handles relative file paths in environment variables (e.g. `MRSM_PLUGINS_DIR`).
- Fix environment variables handling within the CLI daemon.
  Certain environment variables interfered with the shell and the daemon, and this case has been handled.
- Reload the CLI daemon after upgrading packages.
  Installing or upgrading packages now reloads the CLI daemon.
- Invalidate symlinks check cache when unloading plugins.
  Unloading plugins now reverts the internal `_synced_symlinks` check.
- Unload the root `plugins` package when unloading plugins.
  The root `plugins` package is now unloaded when `unload_plugins()` is called. This invalidates lingering cache from previously loaded plugins.
v3.0.0
- Inherit another pipe's base parameters with `reference`.
  Pipes may inherit the base parameters of other pipes by setting the key `reference`:
- Dynamically symlink to other pipes' attributes.
  Reference attributes of other pipes using the `{{ Pipe(...) }}` syntax. These references are resolved at run-time when `Pipe.parameters` is accessed:
- Add `MRSM` config symlinks to pipe parameters.
  Similar to the `{{ Pipe(...) }}` syntax, you can now reference your Meerschaum configuration from within a pipe's parameters:
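As a rough illustration of the idea (not Meerschaum's actual resolver), the sketch below substitutes `MRSM{...}` references from a nested dictionary. The `EXAMPLE_CONFIG` contents, the helper names `lookup` and `resolve_config_symlinks`, and the exact resolution rules are assumptions made for this example.

```python
import re

# Hypothetical stand-in for a loaded Meerschaum configuration dictionary.
EXAMPLE_CONFIG = {
    "system": {
        "connectors": {
            "sql": {"instance": {"temporary_target": {"prefix": "_"}}},
        },
    },
}

# Matches references of the form MRSM{key:subkey:...}.
MRSM_REFERENCE = re.compile(r"MRSM\{([^{}]+)\}")


def lookup(config, key_path):
    """Walk a colon-separated key path through nested dictionaries."""
    value = config
    for key in key_path.split(":"):
        value = value[key.strip()]
    return value


def resolve_config_symlinks(value, config):
    """Recursively replace MRSM{...} references inside dicts, lists, and strings."""
    if isinstance(value, dict):
        return {k: resolve_config_symlinks(v, config) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_config_symlinks(v, config) for v in value]
    if not isinstance(value, str):
        return value
    whole = MRSM_REFERENCE.fullmatch(value)
    if whole:
        # The string is exactly one reference, so keep the referenced value's type.
        return lookup(config, whole.group(1))
    # Otherwise substitute each reference into the surrounding text.
    return MRSM_REFERENCE.sub(lambda m: str(lookup(config, m.group(1))), value)


# Hypothetical pipe parameters containing config references.
parameters = {
    "prefix": "MRSM{system:connectors:sql:instance:temporary_target:prefix}",
    "tables": ["MRSM{system:connectors:sql:instance:temporary_target:prefix}staging"],
}
resolved = resolve_config_symlinks(parameters, EXAMPLE_CONFIG)
```

Resolving on access rather than at write time keeps the stored parameters untouched, which is presumably why the stored references survive edits.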
- Add `Pipe.update_parameters()`.
  Due to the symlinking features, the method `Pipe.update_parameters()` will now appropriately handle updating the parameters within the pipe's attributes. Therefore, mutating `Pipe.parameters` no longer affects the state of the pipe directly.
- Add `InstanceConnector` base class.
  Custom connectors which implement the instance connectors interface should now inherit from `InstanceConnector` as the base class:
- Add long-lived authentication tokens.
  You may now register tokens to programmatically authenticate your applications to a Meerschaum API instance. This is ideal for use cases such as CI/CD, IoT, and other automated workloads. Tokens are restricted by scopes, may expire or be invalidated, and are owned by a user account. Tokens may be managed via the CLI or web console (at `/dash/tokens`, under Settings > Tokens).
- Add scopes to user accounts.
  Similar to tokens, users may be restricted by scopes. Run `edit user` to edit the `scopes` attribute for a given user (more comprehensive editing to come).
- Set `coerce_types` to `True` in `query_df()` if any exclude parameters are provided.
  Prefacing a value with the negation prefix in `params` for `query_df()` will now force `coerce_types` to be `True`.
- Project geometry data to WGS84 (EPSG:4326) when serializing as GeoJSON.
  Setting `geometry_format` to `geojson` (the default for `to_json()`, though not for `serialize_geometry()`) will project to WGS84 if a CRS is provided, to meet the 2016 GeoJSON specification.
- Upgrade `sql:main` to PostgreSQL 17.
  The Meerschaum stack database now ships as `timescale/timescaledb-ha:pg17`, which includes additional extensions such as PostGIS. The flavor for `sql:main` is now `timescaledb-ha` (see below).
- Add the SQL flavor `timescaledb-ha`.
  The default instance connector `sql:main` now has the flavor `timescaledb-ha`, corresponding to the `timescale/timescaledb-ha` Docker image. This image includes PostGIS, `timescaledb_toolkit`, and `pg_stat_statements`.
- Add support for sets and Series in `query_df()`.
  Sets and Pandas Series within `params` will now be treated as lists.
- Allow for spaces and an optional `mrsm.` prefix for templated SQL query definitions.
  The template format `{{Pipe(...)}}` will now match leading and trailing spaces around the `Pipe` declaration, and an optional `mrsm.` prefix is accepted.
- Add `Pipe.autotime`.
  Similar to `Pipe.autoincrement`, setting `autotime` will capture the current timestamp for the value of the `datetime` axis (as datetimes or integers, depending on the dtype):
- Add `Pipe.precision`.
  The parameter `precision` determines the precision (and therefore the size) of the timestamp captured by `autotime`. Accepted values are `nanosecond`, `microsecond`, `millisecond`, `second`, `minute`, `hour`, and `day`. The default value of `precision` is derived from the dtype of the `datetime` axis (e.g. `datetime64[ns]` is `nanosecond` precision).
  The parameter `precision` may either be a string (the unit) or a dictionary with the following keys:
  - `unit` (required)
    The precision unit (`microsecond`, `second`, etc.)
  - `interval` (default 1)
    Optionally round to a specific number of units. For example, `precision='minute'` and `interval=15` would round to 15-minute intervals.
  - `round_to`
    To which direction to round the current timestamp. Supported values are `down` (default), `up`, and `closest`. See `meerschaum.utils.dtypes.round_time()`.
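The interplay of `unit`, `interval`, and `round_to` can be sketched with the standard library alone. This is an illustrative stand-in, not the implementation of `meerschaum.utils.dtypes.round_time()`; the function name `round_timestamp`, the epoch-aligned buckets, and the choice to round exact ties upward are assumptions of the sketch.

```python
from datetime import datetime, timedelta, timezone

# Seconds per supported unit (sub-second units are omitted in this sketch).
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def round_timestamp(ts, unit="minute", interval=1, round_to="down"):
    """Round `ts` to a multiple of `interval` units, measured from the Unix epoch."""
    if round_to not in ("down", "up", "closest"):
        raise ValueError(f"Unsupported round_to: {round_to!r}")
    step = timedelta(seconds=UNIT_SECONDS[unit] * interval)
    epoch = datetime(1970, 1, 1, tzinfo=ts.tzinfo)
    remainder = (ts - epoch) % step
    floor = ts - remainder
    if not remainder or round_to == "down":
        return floor
    if round_to == "up":
        return floor + step
    # "closest": pick the nearer boundary, sending exact ties upward.
    return floor + step if remainder * 2 >= step else floor


now = datetime(2025, 1, 1, 12, 8, tzinfo=timezone.utc)
rounded_down = round_timestamp(now, unit="minute", interval=15)
rounded_up = round_timestamp(now, unit="minute", interval=15, round_to="up")
rounded_closest = round_timestamp(now, unit="minute", interval=15, round_to="closest")
```

With 15-minute intervals, 12:08 falls between the 12:00 and 12:15 boundaries, so the three modes pick the lower, upper, and nearer boundary respectively.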
- Add `Pipe.get_value()`.
  The convenience function `Pipe.get_value()` selects single values from result sets of `Pipe.get_data()`:
- Add `Pipe.get_doc()`.
  Similar to `Pipe.get_value()`, the method `Pipe.get_doc()` will return a single row as a dictionary:
- Add the parameter `mixed_numerics`.
  Setting `mixed_numerics` to `False` will prevent the behavior of coercing integer to float columns as `numeric`, akin to `static=True` but just for this behavior.
- Introducing the Meerschaum CLI daemon.
  Actions are now routed through a long-lived daemon process, reducing the number of active connections and cutting latency for CLI commands. The shell includes the command `daemon` which temporarily toggles between the CLI daemon and executing in-process. Adding the flag `--no-daemon` to any action will disable the CLI daemon routing, and the CLI daemon may be disabled by setting `system.experimental.cli_daemon` to `false`. You may selectively allow or restrict action prefixes under the configuration `system.cli.allowed_prefixes` (default `*`) and `system.cli.disallowed_prefixes`.
- Performance improvements through smarter caching.
  The metadata caching system has been overhauled, drastically reducing redundant work and increasing performance. Pipes' metadata are cached on-disk, and providing `cache_connector_keys` to the Pipe constructor will cache to a Valkey instance instead. Set `cache=False` to disable this behavior.
- Restrict Webterms to API processes.
  In previous releases, a single Webterm process was shared amongst API processes. Now each API process requires its own Webterm server, and `tmux` sessions are separated by port.
- Fix custom actions with spaces in Web Console.
- Ignore `schema` from pipes' parameters on SQLite.
- Tweak `begin` and `end` input sizes in the pipes card.
2.9.x Releases
The 2.9 series added support for geometry data and improved the web console development experience.
v2.9.5
- Add the `Query Data` dropdown to pipes' cards.
  Similar to the `Recent Data` dropdown, you can now explore a pipe's data with the `Query Data` dropdown. This supports `begin`, `end`, `params`, and `limit` filtering.
- Additional fixes for in-place syncs on PostGIS.
  Geometry columns are now correctly coalesced for PostGIS.
v2.9.2 – v2.9.4
- Fix in-place syncs with `GEOMETRY` columns.
  The query to retrieve `GEOMETRY` data types for PostGIS has been fixed.
- Consolidate 1-minute chunks in `Pipe.get_chunk_bounds()`.
  The chunk bounds returned from `Pipe.get_chunk_bounds()` (without specifying an `end`) now consolidate the 1-minute chunk into the last chunk:
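The consolidation can be pictured with a small standard-library sketch. It is not the body of `Pipe.get_chunk_bounds()`; the function name `consolidated_chunk_bounds`, its arguments, and the assumption that the trailing sliver is at most one minute long are all illustrative.

```python
from datetime import datetime, timedelta


def consolidated_chunk_bounds(begin, end, chunk_interval, sliver=timedelta(minutes=1)):
    """Split [begin, end) into chunks, folding a short trailing chunk into the previous one."""
    bounds = []
    start = begin
    while start < end:
        stop = min(start + chunk_interval, end)
        bounds.append((start, stop))
        start = stop
    # If the final chunk is only a sliver, merge it into the chunk before it.
    if len(bounds) > 1 and (bounds[-1][1] - bounds[-1][0]) <= sliver:
        previous_start, _ = bounds[-2]
        bounds[-2:] = [(previous_start, bounds[-1][1])]
    return bounds


example_bounds = consolidated_chunk_bounds(
    datetime(2024, 1, 1),
    datetime(2024, 1, 3, 0, 1),  # one minute past a chunk boundary
    timedelta(days=1),
)
```

Without the merge, the example would end with a separate one-minute chunk starting at midnight on January 3rd; with it, the last chunk simply extends by that minute.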
- Remove unnecessary `CREATE SCHEMA IF EXISTS` checks.
- Fix unsupported syntax for Python 3.8.
- Truncate long connector keys in the Shell prompt.
- Fix null location keys behavior for pipes cards.
- Fix autoincrementing primary keys for custom schemas.
v2.9.0 – v2.9.1
- Add the dtype `geometry` (and `geography`).
  The new data type `geometry` adds support for syncing GIS data (e.g. WKB/WKT, GeoJSON, `shapely` objects, and `GeoDataFrames`).
The geometry dtype syntax supports constraints for Geometry type and/or CRS (SRID):
Syncing a GeoDataFrame (without specifying an explicit dtype) will detect any CRS and geometry type:
- Add the SQLConnector flavor `postgis`.
  The new flavor `postgis` (built atop the `postgresql` flavor) allows pipes to natively support `GEOMETRY` (and `GEOGRAPHY`) types.
- Add a pages sidebar to the Web Console.
  Clicking the Meerschaum logo on the Web Console will show the pages navigation sidebar, allowing you to easily expand the web app. Custom pages added via `@web_page` are grouped by plugin. You can override this behavior with the `page_group` parameter:
- Add the property `instance_keys` to `api` connectors.
  The optional property `instance_keys` determines the value of `instance_keys` to be sent alongside pipe requests.
- Insert the Web Console navbar into custom pages.
  Custom pages added via `@web_page` will now include the simple navbar by default, to more tightly integrate custom pages into the Web Console. This behavior may be disabled by setting `skip_navbar=True` in `@web_page`:
- Create `INT` columns for dtypes `int32`, `SMALLINT` for `int16`.
  The `SQLConnector` now maps the Pandas dtypes `int32` to `INT` and `int16` (and `int8`) to `SMALLINT` rather than defaulting to `BIGINT` for everything.
- Fix serialization of `valkey` pipes without indices.
  Pipes synced without `columns` now correctly serialize documents' keys.
- Add API endpoints for clearing pipes and chunk bounds.
  The endpoints `/pipes/{connector_keys}/{metric_key}/{location_key}/clear` and `/pipes/{connector_keys}/{metric_key}/{location_key}/chunk_bounds` now allow API users to clear pipes (rather than using the legacy actions endpoint) and get the values from `pipe.get_chunk_bounds()`.
- Skip venv locking on Windows.
- Shrink `full` Docker image size.
2.8.x Releases
The 2.8 series introduced batches to verify pipes as well as more granular control over exposed instances via the API.
v2.8.4
- Allow for pattern matching in `allowed_instance_keys`.
  You may now generalize the instances exposed by the API by using Unix-style patterns in the list `system:api:permissions:instances:allowed_instance_keys`:
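Unix-style matching of this kind is what Python's `fnmatch` module provides, so the check can be approximated as below. The helper name `is_instance_allowed` and the example instance keys are hypothetical, and whether Meerschaum uses `fnmatch` internally is not stated in these notes.

```python
from fnmatch import fnmatchcase


def is_instance_allowed(instance_keys, allowed_instance_keys=("*",)):
    """Return True if `instance_keys` matches any of the allowed Unix-style patterns."""
    return any(
        fnmatchcase(instance_keys, pattern)
        for pattern in allowed_instance_keys
    )


# Hypothetical allow-list: every SQL instance plus one specific Valkey instance.
allowed = ["sql:*", "valkey:cache"]
checks = {
    keys: is_instance_allowed(keys, allowed)
    for keys in ("sql:main", "sql:prod", "valkey:cache", "valkey:main")
}
```

A single `"*"` entry (used as the default here) matches every instance key.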
- Return pipe attributes for the route `/pipes/{connector}/{metric}/{location}`.
  The API routes `/pipes/{connector}/{metric}/{location}` and `/pipes/{connector}/{metric}/{location}/attributes` both return pipe attributes.
- Check entire batches for `verify rowcounts`.
  The command `verify rowcounts` will now check batch boundaries before checking row-counts for individual chunks. This should moderately increase performance.
- Kill orphaned child processes when the parent job is killed.
  Jobs created with pipeline arguments should now kill associated child processes.
- Add `--skip-hooks`.
  The flag `--skip-hooks` prevents any sync hooks from firing when syncing pipes.
- Remove datetime rounding from `parse_schedule()`.
  Scheduled actions now behave as expected: the current timestamp is no longer rounded to the nearest minute, which was causing issues with the `starting in` delay feature.
- Fix `allowed_instance_keys` enforcement.
v2.8.3
- Increase username limit to 60 characters.
- Add chunk retries to `Pipe.verify()`.
- Add instance keys to remaining pipes endpoints.
- Misc bugfixes.
v2.8.0 – v2.8.2
- Add batches to `Pipe.verify()`.
  Verification syncs now run in sequential batches so that they may be interrupted and resumed. See `Pipe.get_chunk_bounds_batches()` for more information:
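To make the interrupt-and-resume idea concrete, here is a minimal sketch that groups chunk bounds into batches and reports where to resume. It does not reproduce `Pipe.get_chunk_bounds_batches()`; the helper names, the `batch_size` argument, and the resume-by-index convention are assumptions for illustration.

```python
def batch_chunk_bounds(chunk_bounds, batch_size):
    """Group consecutive (begin, end) chunk bounds into tuples of at most `batch_size`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        tuple(chunk_bounds[i:i + batch_size])
        for i in range(0, len(chunk_bounds), batch_size)
    ]


def verify_in_batches(batches, verify_chunk, start_batch=0):
    """Run batches in order; return the index of the batch to resume from.

    Returns len(batches) when every batch completed.
    """
    for index in range(start_batch, len(batches)):
        try:
            for begin, end in batches[index]:
                verify_chunk(begin, end)
        except KeyboardInterrupt:
            # Stop cleanly so that the caller can resume at this batch later.
            return index
    return len(batches)


# Integer bounds keep the example short; datetimes work the same way.
example_batches = batch_chunk_bounds([(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)], 2)
```

Because batches run sequentially, an interrupted run only needs to remember one index to pick up where it left off.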
- Add `--skip-chunks-with-greater-rowcounts` to `verify pipes`.
  The flag `--skip-chunks-with-greater-rowcounts` will compare a chunk's rowcount with the rowcount of the remote table and skip if the chunk is greater than or equal to the remote count. This is only applicable for connectors which implement `remote=True` support for `get_sync_time()`.
- Add `verify rowcounts`.
  The action `verify rowcounts` (same as passing `--check-rowcounts-only` to `verify pipes`) will compare row-counts for a pipe's chunks against remote rowcounts. This is only applicable for connectors which implement `get_pipe_rowcount()` with support for `remote=True`.
- Add `remote` to `pipe.get_sync_time()`.
  For pipes which support it (i.e. the `SQLConnector`), the option `remote` is intended to return the sync time of a pipe's fetch definition, like the option `remote` in `Pipe.get_rowcount()`.
- Allow for the Web API to serve pipes from multiple instances.
  You can disable this behavior by setting `system:api:permissions:instances:allow_multiple_instances` to `false`. You may also explicitly allow which instances may be accessed by the Web API by setting the list `system:api:permissions:instances:allowed_instance_keys` (defaults to `["*"]`).
- Fix memory leak for retrying failed chunks.
  Failed chunks were kept in memory and retried later. In resource-intensive syncs with large chunks and high failures, this would result in large objects not being freed and hogging memory. This situation has been fixed.
- Add negation to job actions.
  Prefix a job name with an underscore to select all other jobs. This is useful for filtering out noise for `show logs`.
- Add `Pipe.parent`.
  As a quality-of-life improvement, the attribute `Pipe.parent` will return the first member of `Pipe.parents` (if available).
- Use the current instance for new tabs in the Webterm.
  Clicking "New Tab" will open a new `tmux` window using the currently selected instance on the Web Console.
- Other Webterm quality-of-life improvements.
  Added a size toggle button to allow for the Webterm to take the entire page.
- Additional refactoring work.
  The API endpoints code has been cleaned up.
- Added system configurations.
  New options have been added to the `system` configuration, such as `max_response_row_limit`, `allow_multiple_instances`, and `allowed_instance_keys`.
2.7.x Releases
The 2.7 series greatly improved indexing and numerics support, added the bytes type, allowed for bypassing dtype enforcement (`Pipe.enforce`), and introduced persistent Webterm sessions.
v2.7.9 – v2.7.10
- Add persistent Webterm sessions.
  On the Web Console, the Webterm will attach to a persistent terminal for the current session's user.
- Reconnect Webterms after client disconnect.
  If a Webterm socket connection is broken, the client logic will attempt to reconnect and attach to the `tmux` session.
- Add `tmux` sessions to Webterms.
  Webterm sessions now connect to `tmux` sessions (tied to the user accounts). Set `system:webterm:tmux:enabled` to `false` to disable `tmux` sessions.
- Limit concurrent connections during `verify pipes`.
  To keep from exhausting the SQL connection pool, the number of concurrent intra-chunk connections is now limited.
- Return the precision and scale from a table's columns and types.
  Reading a table's columns and types with `meerschaum.utils.sql.get_table_columns_types()` now returns the precision and scale for `NUMERIC` (`DECIMAL`) columns.
v2.7.8
- Add support for user-supplied precision and scale for `numeric` columns.
  You may now manually specify a numeric column's precision and scale:
- Serialize `numeric` columns to exact values during bulk inserts.
  Decimal values are serialized when inserting into `NUMERIC` columns during bulk inserts.
- Return a generator when fetching with `SQLConnector`.
  To alleviate memory pressure, fetching now skips loading the entire dataframe.
- Add `json_serialize_value()` to handle custom dtypes.
  When serializing documents, pass `json_serialize_value` as the default handler:
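The "default handler" here refers to the `default=` argument of `json.dumps()`, which is called for any value the encoder cannot serialize natively. The sketch below uses a hypothetical stand-in handler, `serialize_value_sketch`, rather than Meerschaum's `json_serialize_value()`; the output formats it chooses (strings for decimals and UUIDs, ISO 8601 for datetimes, base64 for bytes) are assumptions.

```python
import base64
import json
from datetime import datetime, timezone
from decimal import Decimal
from uuid import UUID


def serialize_value_sketch(value):
    """Stand-in default handler: convert values that `json` cannot encode natively."""
    if isinstance(value, Decimal):
        return str(value)
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, UUID):
        return str(value)
    if isinstance(value, (bytes, bytearray)):
        return base64.b64encode(bytes(value)).decode("ascii")
    raise TypeError(f"Cannot serialize value of type {type(value).__name__}")


doc = {
    "id": UUID("12345678-1234-5678-1234-567812345678"),
    "price": Decimal("19.99"),
    "timestamp": datetime(2025, 1, 1, tzinfo=timezone.utc),
    "payload": b"hello",
}
serialized = json.dumps(doc, default=serialize_value_sketch, sort_keys=True)
```

Raising `TypeError` for unknown types matches what `json.dumps()` expects from a `default` callable.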
- Fix an issue with the `WITH` keyword in pipe definitions for MSSQL.
  Previously, pipes which used the keyword `WITH` but not as a CTE (e.g. to specify an index) were incorrectly parsed.
v2.7.7
- Add actions `drop indices` and `index pipes`.
  You may now drop and create indices on pipes with the actions `drop indices` and `index pipes` or the pipe methods `drop_indices()` and `create_indices()`:
- Remove `CAST()` to datetime when selecting from a pipe's definition.
  For some databases, casting to the same dtype causes the query optimizer to ignore the datetime index.
- Add `INCLUDE` clause to datetime index for MSSQL.
  This is to coax the query optimizer into using the datetime axis.
- Remove redundant unique index.
  The two competing unique indices have been combined into a single index (for the key `unique`). The unique constraint (when `upsert` is true) shares the name but has the prefix `UQ_` in place of `IX_`.
- Add pipe parameter `null_indices`.
  Set the pipe parameter `null_indices` to `False` for a performance improvement in situations where null index values are not expected.
- Apply backtrack minutes when fetching integer datetimes.
  Backtrack minutes are now applied to pipes with integer datetime axes.
v2.7.6
- Make temporary table names configurable.
  The values for temporary SQL tables may be set in `MRSM{system:connectors:sql:instance:temporary_target}`. The new default prefix is `'_'`, and the new default transaction length is 4. The values have been re-ordered to target, transaction ID, then label.
- Add connector completions to `copy pipes`.
  When copying pipes, the connector keys prompt will offer auto-complete suggestions.
- Fix stale job results.
  When polling for job results, the job result is dropped from in-memory cache to avoid overwriting the on-disk result.
- Format row counts and seconds into human-friendly text.
  Row counts and sync durations are now formatted into human-friendly representations.
- Add digits to `generate_password()`.
  Random strings from `meerschaum.utils.misc.generate_password()` may now contain digits.
v2.7.3 – v2.7.5
- Allow for dynamic targets in SQL queries.
  Include a pipe definition in double curly braces (à la Jinja) to substitute a pipe's target into a templated query.
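The substitution can be approximated with a regular expression, as in the sketch below. It is not Meerschaum's parser: the `EXAMPLE_TARGETS` lookup table, the helper names, and the restriction to positional string keys are assumptions. The pattern tolerates the surrounding spaces and optional `mrsm.` prefix that later releases in this changelog describe.

```python
import re

# Matches {{ Pipe(...) }} with optional surrounding spaces and an optional "mrsm." prefix.
PIPE_TEMPLATE = re.compile(r"\{\{\s*(?:mrsm\.)?Pipe\((?P<args>[^)]*)\)\s*\}\}")

# Hypothetical lookup from (connector, metric, location) keys to target table names.
EXAMPLE_TARGETS = {("sql:main", "weather", None): "sql_main_weather"}


def parse_keys(args):
    """Parse positional string keys such as "'sql:main', 'weather'" into a 3-tuple."""
    keys = [part.strip().strip("'\"") for part in args.split(",") if part.strip()]
    keys += [None] * (3 - len(keys))
    return tuple(keys[:3])


def render_query(query, targets):
    """Replace each templated pipe reference with its target table name."""
    def substitute(match):
        return targets[parse_keys(match.group("args"))]
    return PIPE_TEMPLATE.sub(substitute, query)


rendered = render_query(
    "SELECT * FROM {{ Pipe('sql:main', 'weather') }} WHERE temperature > 30",
    EXAMPLE_TARGETS,
)
```

Keyword arguments inside `Pipe(...)` (for example an instance) are deliberately out of scope for this sketch.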
- Add `--skip-enforce-dtypes`.
  To override a pipe's `enforce` parameter, pass `--skip-enforce-dtypes` to a sync.
- Add bulk inserts for MSSQL.
  To disable this behavior, set `system:connectors:sql:bulk_insert:mssql` to `false`. Bulk inserts for PostgreSQL-like flavors may now be disabled as well.
- Fix altering multiple column types for MSSQL.
  When a table has multiple columns to be altered, each column will have its own `ALTER TABLE` query.
- Skip enforcing custom dtypes when `enforce=False`.
  To avoid confusion, special Meerschaum data types (`numeric`, `json`, etc.) are not coerced into objects when `enforce=False`.
- Fix timezone-aware casts.
  A bug which made it possible to mix timezone-aware and timezone-naive casts in a single query has been fixed.
- Explicitly cast timezone-aware datetimes as UTC in SQL syncs.
  By default, timezone-aware columns are now cast as time zone UTC in SQL. This may be skipped by setting `enforce` to `False`.
- Added virtual environment inter-process locks.
  Competing processes now cooperate for virtual environment verification, which protects installed packages.
v2.7.0 – v2.7.2
- Introduce the `bytes` data type.
  Instance connectors which support binary data (e.g. `SQLConnector`) may now take advantage of the `bytes` dtype. Other connectors (e.g. `ValkeyConnector`) may use `meerschaum.utils.dtypes.serialize_bytes()` to store binary data as a base64-encoded string.
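Base64 encoding itself is available in the standard library, so the round trip for text-only stores can be sketched as follows. The helper names `bytes_to_text` and `text_to_bytes` are hypothetical and stand in for, rather than reproduce, `meerschaum.utils.dtypes.serialize_bytes()`.

```python
import base64


def bytes_to_text(data: bytes) -> str:
    """Encode binary data as a base64 string suitable for text-only stores."""
    return base64.b64encode(data).decode("utf-8")


def text_to_bytes(text: str) -> bytes:
    """Decode a base64 string back into the original binary data."""
    return base64.b64decode(text)


encoded = bytes_to_text(b"hello")
decoded = text_to_bytes(encoded)
```

The encoded form is roughly a third larger than the raw bytes, which is the usual trade-off for storing binary data as text.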
- Allow for pipes to use the same column for `datetime`, `primary`, and `autoincrement=True`.
  Pipes may now use the same column as the `datetime` axis and `primary` with `autoincrement` set to `True`.
- Only join on `primary` when present.
  When the index `primary` is set, use the column as the primary joining index. This will improve performance when syncing tables with a primary key.
- Add the parameter `enforce`.
  The parameter `enforce` (default `True`) toggles data type enforcement behavior. When `enforce` is `False`, incoming data will not be cast to the desired data types. For static datasets where the incoming data is always expected to be of the correct dtypes, it is recommended to set `enforce` to `False` and `static` to `True`.
- Create the `datetime` axis as a clustered index for MSSQL, even when a `primary` index is specified.
  Specifying a `datetime` and `primary` index will create a nonclustered `PRIMARY KEY`. Specifying the same column as both `datetime` and `primary` will create a clustered primary key (tip: this is useful when `autoincrement=True`).
- Increase the default chunk interval to 43200 minutes.
  New hypertables will use a default chunksize of 30 days (43200 minutes).
- Virtual environment bugfixes.
  Existing virtual environment packages are backed up before re-initializing a virtual environment. This fixes the issue of disappearing dependencies.
- Store `numeric` as `TEXT` for SQLite and DuckDB.
  Due to limited precision, `numeric` columns are now stored as `TEXT`, then parsed into `Decimal` objects upon retrieval.
- Show the Webterm by default when changing instances.
  On the Web Console, changing the instance select will make the Webterm visible.
- Improve dtype inference.
2.6.x Releases
The 2.6 series added the primary index, autoincrement, and migrated to timezone-aware datetimes by default, as well as many quality-of-life improvements, especially for MSSQL.
v2.6.17
- Add relative deltas to `starting in` scheduler syntax.
  You may specify a delta in the job scheduler `starting` syntax:
- Fix `drop pipes` for pipes on custom schemas.
  Pipes created under a specific schema are now correctly dropped.
- Enhance editing pipeline jobs.
  Pipeline jobs now provide the job label as the default text to be edited. Pipeline arguments are now placed on a separate line to improve legibility.
- Disable the progress timer for jobs.
  The `sync pipes` progress timer will now be hidden when running through a job.
- Unset `MRSM_NOASK` for daemons.
  Now that jobs may accept user input, the environment variable `MRSM_NOASK` is no longer needed for jobs run as daemons (executor `local`).
- Replace `Cx_Oracle` with `oracledb`.
  The Oracle SQL driver is no longer required now that the default Python binding for Oracle is `oracledb`.
- Fix Oracle auto-incrementing for good.
  At long last, the mystery of Oracle auto-incrementing identity columns has been laid to rest.
v2.6.15 – v2.6.16
- Fix inplace syncs without a `datetime` axis.
  A bug introduced by a performance optimization has been fixed. Inplace pipes without a `datetime` axis will skip searching for date bounds. Setting `upsert` to `true` will bypass this bug for previous releases.
- Skip invoking `get_sync_time()` for pipes without a `datetime` axis.
  Invoking an instance connector's `get_sync_time()` method will now only occur when `datetime` is set.
- Remove `guess_datetime()` check from `SQLConnector.get_sync_time()`.
  Because sync times are only checked for pipes with a dedicated `datetime` column, the `guess_datetime()` check has been removed from the `SQLConnector.get_sync_time()` method.
- Skip persisting default `target` to parameters.
  The default target table name will no longer be persisted to `parameters`. This helps avoid accidentally setting the wrong target table when copying pipes.
- Default to "no" for syncing data when copying pipes.
  The action `copy pipes` will no longer sync data by default, instead requiring an explicit yes to begin syncing.
- Fix the "Update query" button behavior on the Web Console.
  Existing but null keys are now accounted for when updating a SQL pipe's query.
- Fix another Oracle autoincrement edge case.
  Resetting the autoincrementing primary key value on Oracle will now behave as expected.
v2.6.10 – v2.6.14
- Improve datetime timezone-awareness enforcement performance.
  Datetime columns are only parsed for timezone awareness if the desired awareness differs. This drastically speeds up sync times.
- Switch to `tz_localize()` when stripping timezone information.
  The previous method of using a lambda to replace individual `tzinfo` attributes did not scale well. Using `tz_localize()` can be vectorized and greatly speeds up syncs, especially with large chunks.
- Add `enforce_dtypes` to `Pipe.filter_existing()`.
  You may optionally enforce dtype information during `filter_existing()`. This may be useful when implementing custom syncs for instance connectors. Note this may impact memory and compute performance.
- Fix `query_df()` for null parameters.
  This is useful for when you may use `query_df()` with only `select_columns` or `omit_columns`.
- Fix autoincrementing IDs for Oracle SQL.
- Enforce security settings for creating jobs.
  Jobs and remote actions will only be accessible to admin users when running with `--secure` (`system:permissions:actions:non_admin` in config).
v2.6.6 – v2.6.9
- Improve metadata performance when syncing.
  Syncs via the SQLConnector now cache schema and index metadata, speeding up transactions.
- Fix upserts for MySQL / MariaDB.
  Upserts in MySQL and MariaDB now use `ON DUPLICATE` instead of `REPLACE INTO`.
- Fix dtype detection for index columns.
  A bug where new index columns were incorrectly created as `INT` has been fixed.
- Delete old keys when dropping Valkey pipes.
  Dropping a pipe from Valkey now clears all old index keys.
- Fix timezone-aware enforcement bugs.
v2.6.1 – v2.6.5
- Add `Pipe.tzinfo`.
  Check if a pipe is timezone-aware with `tzinfo`:
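For background, the aware-versus-naive distinction comes from Python's own `datetime` objects. The sketch below shows the standard-library check and a UTC coercion of the kind this release series describes; the helper names `is_aware` and `as_utc` are hypothetical, and this is not the code behind `Pipe.tzinfo`.

```python
from datetime import datetime, timedelta, timezone


def is_aware(ts: datetime) -> bool:
    """A datetime is timezone-aware when it carries a usable UTC offset."""
    return ts.tzinfo is not None and ts.utcoffset() is not None


def as_utc(ts: datetime) -> datetime:
    """Treat naive datetimes as UTC and convert aware datetimes to UTC."""
    if not is_aware(ts):
        return ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc)


naive = datetime(2024, 1, 1, 12, 0)
offset = datetime(2024, 1, 1, 12, 0, tzinfo=timezone(timedelta(hours=2)))
naive_as_utc = as_utc(naive)
offset_as_utc = as_utc(offset)
```

Note the difference between the two branches: a naive value keeps its wall-clock time and only gains a UTC label, while an aware value is shifted to the equivalent UTC instant.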
- Improve timezone enforcement when syncing.
- Fix inplace syncs with `upsert=True`.
- Fix timezone-aware datetime truncation for MSSQL.
- Fix timezone detection for existing timezone-naive tables.
v2.6.0
- Enforce a timezone-aware `datetime` axis by default.
  Pipes now enforce timezone-naive datetimes as UTC, even if the underlying column type is timezone-naive. To use timezone-naive datetime axes, you must explicitly set the `dtype` to `datetime64[ns]`.
- Designate the index name `primary` for primary keys.
  Like the `datetime` index, the `primary` index is used for joins and will be created as the primary key in new tables.
- Add `autoincrement` to `Pipe.parameters`.
  Like `upsert`, you may designate an incremental integer primary key by setting `autoincrement` to `True` in the pipe parameters. Note that `autoincrement` will be `True` if you specify a `primary` index but do not specify its dtype or include the column in the initial dataframe. This is only available for `sql` pipes.
- Add option `static` to `Pipe.parameters` to disable schema modification.
  Set `static` to `True` in a pipe's parameters to prevent any modification of the columns' data types.
- Add `get_create_table_queries()` to build from `dtypes` dictionaries.
  You may get a `CREATE TABLE` query from a `dtypes` dictionary (in addition to a `SELECT` query). The function `meerschaum.utils.sql.get_create_table_query()` now also accepts the arguments `primary_key` and `autoincrement` to designate a primary key column.
- Create a multi-column index for columns defined in `Pipe.columns`.
  To disable this behavior, set `unique` to `None` in `Pipe.indices`.
- Default to `BIT` for boolean columns in MSSQL.
  The previous workaround was to store `bool` columns as `INT`. This change now defaults to `BIT` when creating new tables. Boolean columns cannot be nullable for MSSQL.
- Improve file protection in `edit config`.
  Writing an invalid config file will now stop you before committing the changes. The previous behavior would lead to data loss.
- Catch exceptions when creating chunk labels.
  If a datetime bound cannot be determined for a chunk, return `pd.NA`.
2.5.x Releases
The 2.5.x series was short and sweet, primarily introducing features relating to `Pipe.indices`.
v2.5.1
- Update index information in the pipe card.
  The `Indices` section of the pipe card on the web console includes more detailed information, such as composite and multi-column indices.
- Print action results during scheduled jobs.
  Scheduled actions now print their result success tuples after firing.
- Other bugfixes.
  A few bugs from the migration of `APScheduler` to internal management have been fixed.
v2.5.0
- Add `indices` to `Pipe.parameters`.
  You may now explicitly state the indices to be created by defining `indices` (or `indexes`) in `Pipe.parameters` (or the `Pipe` constructor for your convenience).
You may also use the key index_template to change the format of the generated index names (defaults to IX_{target}_{column_names}, where target is the table name and column_names consists of all of the index's columns joined by an underscore).
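The default template reads like a Python format string, so the naming rule can be sketched with `str.format()`. The helper `build_index_name` and the example table and column names are hypothetical; only the default template and the underscore-joined `column_names` come from the description above.

```python
DEFAULT_INDEX_TEMPLATE = "IX_{target}_{column_names}"


def build_index_name(target, columns, index_template=DEFAULT_INDEX_TEMPLATE):
    """Fill an index template with the table name and underscore-joined column names."""
    return index_template.format(
        target=target,
        column_names="_".join(columns),
    )


default_name = build_index_name("weather", ["timestamp", "station"])
custom_name = build_index_name(
    "weather",
    ["timestamp", "station"],
    index_template="idx_{column_names}_on_{target}",
)
```

A custom template only needs to include the placeholders it cares about, since `str.format()` ignores unused keyword arguments.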
- Enable chunking for MSSQL.
  To improve memory usage, `chunksize` is now accepted by `SQLConnector.read()` for the flavor `mssql`.
- Disable `pyodbc` pooling.
  To properly recycle the connection pool in the SQLAlchemy engine, the internal `pyodbc` pooling must be disabled. See the SQLAlchemy documentation.
- Bugfixes.
  Other miscellaneous bugfixes have been included in this release, such as resolving broken imports during certain edge cases.
2.4.x Releases¶
The 2.4.x series added the ValkeyConnector, relative --begin and --end, pipeline timeouts, and improved MSSQL support.
v2.4.13¶
- Add --timeout to pipeline arguments.
You may now designate the maximum number of seconds to run a pipeline with --timeout. This will run the entire pipeline in a subprocess rather than a persistent session.
- Add auto-complete to edit jobs and bootstrap jobs.
- Improve the editing experience for edit jobs and bootstrap jobs.
- Fixed plugin detection for Python 3.9.
v2.4.12¶
- Add the actions edit jobs and bootstrap jobs.
The action edit jobs lets you easily tweak the arguments for an existing job, so there's no need to delete and recreate jobs. The bootstrap jobs wizard also gives you a chance to review your changes before starting a job.
- Fix nested CTEs for MSSQL.
Pipes may now use definitions containing a WITH clause for Microsoft SQL Server.
- Added wrap_query_with_cte to meerschaum.utils.sql.
Reference a subquery in an encapsulating parent query, even if the subquery contains CTEs itself.
- Fix --yes when running in background jobs.
The flags --yes and --noask are now properly handled when running a background job which contains prompts.
- Add an external page for jobs to the Web Console.
Like the shareable /pipes/ links, you may now link to a specific job at the path /dash/job/{name}. Click the name of the job on the card to open a job in a new tab.
- Preserve the original values for --begin and --end.
When creating jobs in the shell, the original string values for --begin and --end will be preserved, such as in the case of --begin 1 month ago.
- Fix Pipe formatting for small terminals.
Pipes with long names are now properly rendered in small terminal windows.
- Enable shell suggestions for chained actions.
The shell auto-complete now works with chained actions.
v2.4.9 – v2.4.11¶
- Add relative formats to --begin and --end.
The flags --begin and --end support values in the format [N] [unit] ago.
Add a second delta format (recommended to be denoted by the keyword rounded) to round the timestamp to a clean value.
Supported units are seconds, minutes, hours, days, weeks, months (ago only), and years.
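As a rough illustration of the [N] [unit] ago format, the sketch below parses such a string with the standard library. It is not Meerschaum's parser: the function parse_relative is hypothetical, it covers only fixed-length units (months and years need calendar arithmetic), and it does not attempt the rounded form.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Fixed-length units only; months and years are deliberately left out of this sketch.
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400, "week": 604800}


def parse_relative(value: str, now: Optional[datetime] = None) -> datetime:
    """Turn a string like '3 days ago' into a timestamp relative to `now`."""
    match = re.fullmatch(r"\s*(\d+)\s+([a-z]+?)s?\s+ago\s*", value.lower())
    if match is None:
        raise ValueError(f"Expected '[N] [unit] ago', got: {value!r}")
    count, unit = int(match.group(1)), match.group(2)
    if unit not in UNIT_SECONDS:
        raise ValueError(f"Unsupported unit in this sketch: {unit!r}")
    reference = now if now is not None else datetime.now(timezone.utc)
    return reference - timedelta(seconds=count * UNIT_SECONDS[unit])


print(parse_relative("3 days ago", now=datetime(2024, 1, 10)))
```

Passing now explicitly keeps the function deterministic, which is convenient when checking how a given bound would resolve.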
- Respect --begin, --end, and --params in show rowcounts.
The flags --begin, --end, and --params are now handled in the action show rowcounts.
- Fix an issue with Pipe.get_backtrack_data().
An incorrect calculation was fixed to produce the correct backtrack interval.
v2.4.8¶
- Allow for syncing against DATETIMEOFFSET columns in MSSQL.
When syncing an existing table with a DATETIMEOFFSET column, timestamps are now correctly coerced into timezone-naive UTC timestamps. Note this behavior will likely change to timezone-aware-by-default in a future release.
- Default to DATETIME2 for MSSQL.
To preserve precision, MSSQL now creates datetime columns as DATETIME2.
- Remove temporary tables warnings.
Failure to drop temporary tables no longer raises a warning (added IF EXISTS check).
- Fix an issue with the sql action.
- Fix UUID support for SQLite, MySQL / MariaDB.
- Set IS_THREAD_SAFE to False for Oracle.
v2.4.6 – v2.4.7¶
- Prefix temporary tables with ##.
Temporary tables are now prefixed with ## to take advantage of tempdb in MSSQL.
- Add the uuid dtype.
The uuid dtype adds support for Python UUID objects and maps to the appropriate UUID data type per SQL flavor (e.g. UNIQUEIDENTIFIER for mssql).
- Add upsert support to MSSQL.
Setting upsert in a pipe's parameters will now upsert rows in a single transaction (via a MERGE query).
- Add SQLConnector.get_connection().
To simplify connection management, you may now obtain an active connection with SQLConnector.get_connection(). To force a new connection, pass rebuild=True.
- Improve session management for MSSQL.
Transactions and connections are now more gracefully handled when working with MSSQL.
v2.4.2 – v2.4.5¶
- Fix bootstrap connectors.
Revert a breaking change to the bootstrap connectors wizard.
- Respect disabling uv for package installation.
Setting system:experimental:uv_pip to false will now reliably disable uv.
- Default to a query string for options when bootstrapping MSSQL connectors.
Although dictionaries are supported for options, using a dictionary as a default was breaking serialization. The default for options is now the string driver=ODBC Driver 17 for SQL Server&UseFMTONLY=Yes.
- Default to MSSQL ODBC Driver 18.
The default driver to be used by MSSQL connectors is now version 18.
v2.4.1¶
- Add instance to the external pipe links.
When sharing pipe links on the Web Console, the instance will now be included in the URL.
- Fix an issue with remote actions.
An import error has been patched.
v2.4.0¶
- Add valkey instance connectors.
Introducing a new first-class instance connector: the ValkeyConnector. Valkey, a fork of Redis, is a high-performance in-memory database often used for caching. The valkey service has been added to the Meerschaum stack and is accessible via the built-in connector valkey:main.
- Cache Web Console sessions in Valkey when running with --production.
Starting the web API with --production will now store sessions in valkey:main. This results in a smoother experience in the event of a web server restart. By default, sessions expire after 30 days.
You may disable this behavior by setting system:experimental:valkey_session_cache to false.
- Allow for a default executor.
Setting the key meerschaum:executor will set the default executor (overriding the check for systemd). This is useful for defaulting to remote actions in a multi-node deployment.
- Allow querying for None in query_df().
You may now query for null rows.
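The idea of querying for None can be sketched without pandas by filtering a list of dictionaries. This is only an analogy for the behavior described above: filter_docs and is_null are hypothetical helpers, not the query_df() implementation, and treating a missing key as null is an assumption of this sketch.

```python
import math
from typing import Any, Dict, List


def is_null(value: Any) -> bool:
    """Treat None and float NaN as null."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def filter_docs(docs: List[Dict[str, Any]], params: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Keep documents matching every key in `params`; a None value selects null rows."""
    def matches(doc: Dict[str, Any]) -> bool:
        for key, wanted in params.items():
            actual = doc.get(key)  # assumption: a missing key counts as null
            if wanted is None:
                if not is_null(actual):
                    return False
            elif actual != wanted:
                return False
        return True

    return [doc for doc in docs if matches(doc)]


docs = [{"id": 1, "color": "red"}, {"id": 2, "color": None}, {"id": 3}]
print(filter_docs(docs, {"color": None}))
```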
- Improve query_df() performance.
Dataframe values are no longer serialized by default in query_df(), meaning that parameters must match the data type. Pass coerce_types=True to restore legacy behavior.
- Add Pipe.copy_to().
Copy pipes between instances with Pipe.copy_to().
- Add include_unchanged_columns to Pipe.filter_existing().
Pass include_unchanged_columns=True to return entire documents in the update dataframe. This is useful for situations where you are unable to update individual fields.
- Add a share button to the Pipe card.
On the web dashboard, you may now more easily share pipes by clicking the "share" icon and copying the URL. This opens the pipe card in a new, dedicated tab.
- Add OPTIONAL_ATTRIBUTES to connectors.
Connectors may now set OPTIONAL_ATTRIBUTES, which will add skippable prompts in bootstrap connector.
- Remove progress bar for syncing via remote actions.
Executing sync pipes remotely will no longer print the timer progress bar.
- Fix bug with stack in the shell.
Note that stack actions may not be chained.
- Fix scheduler dependency.
To fix the installation of APScheduler, attrs is now held back to 24.1.0.
2.3.x Releases¶
The 2.3 series was short but brought significant improvements, notably the Job API, remote jobs, and action chaining.
v2.3.5 – v2.3.6¶
- Properly handle remote jobs.
Long-running remote jobs are now properly handled, allowing for graceful API shutdown.
- Detect when creating a remote pipeline.
Running a pipeline action with a remote executor will pass through the pipeline to the API server.
- Remove actions websocket endpoint with temporary jobs.
- Properly quote environment variables in systemd services.
- Remove ~/.local volume from api service in the stack.
This was overwriting the new Docker image version and as such needed to be removed.
v2.3.0 – v2.3.4¶
- Add the Job class.
You may now manage jobs with Job.
If you are running on systemd, jobs will be created as user services. Otherwise (e.g. running in Docker) jobs are created as Unix daemons and kept alive by the API server.
You may choose the executor with -e (--executor-keys). Supported values are local, systemd, and the keys for any API instance. See the jobs documentation for more information.
- Chain actions with +.
Run multiple commands by joining them with +, similar to && in bash but with better performance (one process).
Adding -d (--daemon) will escape these joiners and run all of the chained commands in the job.
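Conceptually, chaining with + amounts to splitting one argument list into several commands that run in the same process. The sketch below shows that splitting step only; split_chained is a hypothetical helper, and Meerschaum's actual argument parsing (including how -d escapes the joiners) is not reproduced here.

```python
from typing import List


def split_chained(args: List[str], joiner: str = "+") -> List[List[str]]:
    """Split a flat argument list into one argument list per chained command."""
    commands: List[List[str]] = []
    current: List[str] = []
    for arg in args:
        if arg == joiner:
            # A joiner closes the current command; empty segments are skipped.
            if current:
                commands.append(current)
            current = []
        else:
            current.append(arg)
    if current:
        commands.append(current)
    return commands


print(split_chained("sync pipes + show pipes".split()))
```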
- Run chained actions as a pipeline with :.
You can schedule chained actions by adding : to the end of your command.
Other supported flags are --loop, --min-seconds, and the number of times to run the pipeline (e.g. x2 or 2).
- Add --restart.
Your job will be automatically restarted if you use any of the flags --loop, --schedule, or --restart.
- Execute actions remotely.
You may execute an action on an API instance by setting the executor to the connector keys. You may run the executor command in the Meerschaum shell (like instance) or pass the flag -e (--executor-keys).
The output is streamed directly from the API instance (via a websocket).
- Add from_plugin_import().
You may now easily access attributes from a plugin's submodule with meerschaum.plugins.from_plugin_import().
2.2.x Releases¶
The 2.2.x series introduced new features and improvements, such as the improved scheduler, the switch to uv, the @dash_plugin and @web_page() decorators, and much more.
v2.2.7¶
- Fix daemon stability.
Broken file handlers are now better handled, and this should keep background jobs from crashing.
- Improve show jobs output.
The show jobs table now includes the SuccessTuple of the most recent run (when jobs are stopped).
- Use a plugin's __doc__ string as the default description.
When registering a new plugin, the __doc__ string will be used as the default value for the description.
- Pipes without connectors are no longer considered errors when syncing.
When a pipe has an ordinary string in place of a connector (e.g. externally managed), the sync returns early and reports success rather than throwing an error.
- Add update announcements.
When new Meerschaum releases become available, you will now be presented with an update message when starting the shell. Update checks may be disabled by setting shell:updates:check_remote to false.
- Enforce a 10-minute max timeout for APIConnectors.
v2.2.6¶
- Fix a critical login issue.
The previous release (v2.2.5) broke the login functionality of the Web UI and has been yanked. If you are running v2.2.5, it is urgent that you upgrade immediately.
- Add environment variable MRSM_CONFIG_DIR.
You may now isolate your configuration directory outside of the root (like with MRSM_PLUGINS_DIR and MRSM_VENVS_DIR). This will be useful in certain production deployments where secrets need to be segmented and isolated.
- Add register connector.
Like bootstrap connector, you may now programmatically create connectors.
- Allow for job names to contain spaces and parentheses.
Jobs may now be created with more dynamic names. This issue in particular affected Meerschaum Compose.
- Allow for type annotations in required.
Plugins may now annotate required.
- Automatically include --noask and --yes in remote actions.
For your convenience, the flags --noask and --yes are included in remote actions sent by APIConnector.do_action().
- Fixed an issue with URIs for api connectors.
Creating an APIConnector via a URI connection string now properly handles the protocol.
- Fixed a formatting issue with show logs.
v2.2.5¶
- Add bootstrap plugin.
The bootstrap plugin wizard provides a convenient way to create new plugins from templates.
- Add edit plugin.
The action edit plugin will open a plugin's source file in your editor ($EDITOR or pyvim).
- Allow actions, fetch(), and sync() to omit pipe and **kwargs.
Adding **kwargs (and pipe) is now optional, and you may instead explicitly state only the arguments required.
- Fixed minor bug with subaction detection.
Subaction functions must now explicitly begin with the name of the parent action (underscore prefix allowed).
- Allow for --begin None.
Explicitly setting --begin None will now pass None to fetch().
- Persist user packages in stack Docker container.
The stack Docker Compose file now persists user-level packages (under ~/.local).
- Throw a warning if a @dash_plugin function raises an exception.
- Added connector type to show connectors.
Append a connector type to the show connectors command (e.g. show connectors sql) to see only connectors of a certain type.
- Allow dprint to be imported from meerschaum.utils.warnings.
For convenience, you may now import dprint alongside info, warn, and error.
- Add positional arguments filtering (filter_positional() and filter_arguments()).
In addition to keyword argument filtering, you may now filter positional arguments.
- Cleaned up OAuth flow (/login).
v2.2.2 – v2.2.4¶
- Speed up package installation in virtual environments.
Dynamic dependencies will now be installed via uv, which dramatically speeds up installation times.
- Add sub-cards for children pipes.
Pipes with children defined now include cards for these pipes under the Parameters menu item. This is especially useful when managing pipeline hierarchies.
- Add "Open in Python" to pipe cards.
Clicking "Open in Python" on a pipe's card will now launch ptpython with the pipe object already created.
- Add the decorators @web_page and @dash_plugin.
You may now quickly add your own pages to the web console by decorating your layout functions with @web_page.
- Use ptpython for the python action.
Rather than opening a classic REPL, the python action will now open a ptpython shell.
- Allow passing flags to venv ptpython binaries.
You may now pass flags directly to the ptpython binary of a virtual environment (by escaping with []).
- Allow for custom connectors to implement a sync() method.
Like module-level sync() functions for plugin connectors, any custom connector may implement sync() instead of fetch().
v2.2.1¶
- Fix --schedule in the interactive shell.
The --schedule flag may now be used from both the CLI and the Shell.
- Fix the SQLConnector CLI.
The sql action now correctly opens an interactive CLI.
- Bumped duckdb to >=1.0.0.
The upstream breaking changes that required duckdb to be held back have to do with how indices behave. For now, index creation has been disabled so that duckdb may be upgraded to 1.0+.
v2.2.0¶
New Features
- New job scheduler.
The job scheduler has been rewritten with a simpler syntax.
- Add show schedule.
Validate your schedules' upcoming timestamps with show schedule.
- Added timestamps to log file lines.
Log files now prepend the current minute to each line of the file, and the timestamps are also printed when viewing logs with show logs. To disable this behavior, set MRSM{jobs:logs:timestamps:enabled} to false.
You may change the timestamp format under the config keys MRSM{jobs:logs:timestamps:format} (timestamp written to disk) and MRSM{jobs:logs:timestamps:follow_format} (timestamp printed when following via show logs).
- Add --skip-deps.
When installing plugins, you may skip dependencies with --skip-deps. This should improve the iteration loop during development.
- Add logs buttons to job cards on the Web UI.
For your convenience, "Follow logs" and "Download logs" buttons have been added to jobs' cards.
- Add a Delete button to job cards on the Web UI.
You may now delete a job from its card (once stopped, that is).
- Add management buttons to pipes' cards.
For your convenience, you may now sync, verify, clear, drop, and delete pipes directly from cards.
- Designate your packages as plugins with the meerschaum.plugins entry point.
You may now specify your existing packages as Meerschaum plugins by adding the meerschaum.plugins entry point to your package metadata or, if you are using it, to pyproject.toml.
- Pre- and post-sync hooks are printed separately.
The results of sync hooks are now printed right after execution rather than after the sync.
Bugfixes
- Fixed a filtering bug on the Web UI when changing instances.
When changing instances on the Web Console, the connector, metric, and location choices will reset appropriately.
- Ctrl+C when exiting show logs.
Pressing Ctrl+C will now exit show logs immediately.
Breaking Changes
- No longer supporting the old scheduler syntax.
If you have jobs with the old scheduler syntax (e.g. using the keyword before), you may need to delete and recreate your jobs with an updated schedule.
- Upgraded to psycopg from psycopg2.
The upgrade to psycopg (version 3) should provide better performance for larger transactions.
- Daemon.cleanup() now returns a SuccessTuple.
Other changes
- Bumped xterm.js to v5.5.0.
- Added tags to the pipes card.
- Replaced watchgod with watchfiles.
- Replaced rocketry with APScheduler.
- Removed pydantic from dependencies.
- Removed passlib from dependencies.
- Bumped default TimescaleDB image to latest-pg16-oss.
- Held back duckdb to <0.10.3.
2.1.x Releases¶
The 2.1.x series added high-performance upserts, improved numerics support and temporary tables performance, and many other bugfixes and improvements.
v2.1.7¶
- Add query_df() to meerschaum.utils.dataframe.
The function query_df() allows you to filter dataframes by params, begin, and end.
- Add get_in_ex_params() to meerschaum.utils.misc.
This function parses a standard params dictionary into tuples of include and exclude parameters.
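The include/exclude split can be pictured with a small standalone function. This is a sketch, not the get_in_ex_params() implementation: the helper split_in_ex is hypothetical, and the leading-underscore negation marker is an assumption that this changelog does not state.

```python
from typing import Any, Dict, List, Tuple

# Assumption for this sketch: a leading underscore marks a value to exclude.
NEGATION_PREFIX = "_"


def split_in_ex(params: Dict[str, Any]) -> Dict[str, Tuple[List[Any], List[Any]]]:
    """Map each key to a tuple of (values to include, values to exclude)."""
    result: Dict[str, Tuple[List[Any], List[Any]]] = {}
    for key, value in params.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        include: List[Any] = []
        exclude: List[Any] = []
        for item in values:
            text = str(item)
            if text.startswith(NEGATION_PREFIX):
                exclude.append(text[len(NEGATION_PREFIX):])
            else:
                include.append(item)
        result[key] = (include, exclude)
    return result


print(split_in_ex({"color": ["red", "_blue"], "size": "large"}))
```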
- Add coerce_numeric to pipe.enforce_dtypes().
Setting this to False will not cast floats to Decimal if the corresponding dtype is int.
- Improve JSON serialization when filtering for updates.
- Add date_bound_only to pipe.filter_existing().
The argument date_bound_only means that samples retrieved by pipe.get_data() will only use begin and end for bounding. This may improve performance for custom instance connectors which have limited searchability.
- Add safe_copy to pipe.enforce_dtypes(), pipe.filter_existing(), and filter_unseen_df().
By default, these functions will create copies of dataframes to avoid mutating the input dataframes. Setting safe_copy to False may be more memory efficient.
- Add multiline support to extract_stats_from_message.
Multiple messages separated by newlines may be parsed at once.
- Remove order by check in SQL queries.
- Improve shell startup performance by removing support for cmd2.
The package cmd2 never behaved properly, so support has been removed and only the built-in cmd powers the shell. As such, the configuration key shell:cmd has been removed.
v2.1.6¶
- Move success_tuple from arg to kwarg for @post_sync_hook functions.
To match the signature of @pre_sync_hook functions, @post_sync_hook functions now only accept pipe as the positional argument. The return value of the sync will now be passed as the kwarg success_tuple. This allows you to use the same callback function as both the pre- and post-sync hooks.
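The shared signature described above can be sketched as a single callback that works in both positions. To keep the example runnable without Meerschaum installed, the decorators are left out, a plain string stands in for the pipe, and the function name log_sync is hypothetical.

```python
from typing import Any, Optional, Tuple


def log_sync(
    pipe: Any,
    success_tuple: Optional[Tuple[bool, str]] = None,
    **kwargs: Any,
) -> str:
    """Describe a sync; `success_tuple` is only present after the sync has run."""
    if success_tuple is None:
        # Called as a pre-sync hook: there is no result yet.
        return f"Starting sync for {pipe}."
    success, message = success_tuple
    status = "succeeded" if success else "failed"
    return f"Sync for {pipe} {status}: {message}"


print(log_sync("my_pipe"))
print(log_sync("my_pipe", success_tuple=(True, "Inserted 10 rows.")))
```

Accepting **kwargs lets the same function absorb any additional context passed to hooks without changing its signature.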
- Add sync_timestamp and sync_complete_timestamp to sync hooks.
The UTC datetime right before the sync is added to the sync hook kwargs, allowing for linking the two callbacks to the same datetime. For convenience, the UTC datetime is also captured at the end of the sync and is passed as sync_complete_timestamp.
- Improved performance of sync hooks.
Sync hooks are now called asynchronously in their own threads to avoid slowing down or crashing the main thread.
- Rename duration to sync_duration for sync hooks.
To avoid potential conflicts, the kwarg duration is prefixed with sync_ to denote that it was specifically added to provide context on the sync.
- Allow for sync hooks to return SuccessTuple.
If a sync hook returns a SuccessTuple (Tuple[bool, str]), the result will be printed.
- Add is_success_tuple() to meerschaum.utils.typing.
You can now quickly check whether an object is a SuccessTuple.
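Given that a SuccessTuple is described above as Tuple[bool, str], a check of this kind could look like the following. This is a sketch of the idea under that definition, not the library's is_success_tuple() itself; the name looks_like_success_tuple is hypothetical.

```python
from typing import Any


def looks_like_success_tuple(obj: Any) -> bool:
    """Return True for a two-item tuple of (bool, str)."""
    return (
        isinstance(obj, tuple)
        and len(obj) == 2
        and isinstance(obj[0], bool)
        and isinstance(obj[1], str)
    )


print(looks_like_success_tuple((True, "Success")))
print(looks_like_success_tuple(["yes", "Success"]))
```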
- Allow for index-only pipes when upsert=True.
If all columns are indices and upsert is True, then the upsert will insert net-new rows (ignore duplicates).
- Allow for preliminary null index support for upsert=True (inserts only, PostgreSQL only).
Like with regular syncs, upsert syncs now coalesce indices to allow for syncing null values. NOTE: the transaction will fail if a null index is synced again, so this is only for the initial insert.
- Remove automatic instance table renaming.
This patch removes automatic detection and renaming of old instance tables to the new names (e.g. users -> mrsm_users). Users migrating from an old installation will need to rename the tables manually.
v2.1.5¶
- Add the action tag pipes.
Tags may be added or removed with the tag pipes action. Note that the flag --tags applies to existing tags for filtering; tags to be added or removed are positional arguments.
- Add --tags support to register pipes.
The action register pipes with the --tags flag will auto-tag the new pipes.
- Clean up warnings on Python 3.12.
All instances of datetime.utcnow() have been replaced by datetime.now(timezone.utc).replace(tzinfo=None) (to preserve behavior). A full migration to timezone-aware datetimes would have to happen in a minor release.
- Improve timezone-aware datetime support for MSSQL.
Passing a timezone-aware datetime as a date bound for MSSQL should now work correctly.
- Add an explicit VOLUME to the Dockerfile.
The path /meerschaum is now explicitly set as a VOLUME in the Docker image.
- Add --tags filtering to the show tags action.
- Improve global ThreadPool handling.
Global pools are now created on a per-worker (and per-class) basis, allowing for switching between workers within the same process. Note that global pools are maintained to allow for nested chunking and to limit the number of connections (e.g. avoid threads creating their own pools).
- Fix a bug when selecting only by negating tags.
Pipes may now be selected by only specifying negated tags.
- Rename meerschaum.utils.get_pipes to meerschaum.utils._get_pipes to avoid namespace collisions.
The functions get_pipes() and fetch_pipes_keys() are available at the meerschaum.utils module namespace.
v2.1.3 – v2.1.4¶
- Add the decorators @pre_sync_hook and @post_sync_hook.
The new decorators @pre_sync_hook and @post_sync_hook let you intercept a Pipe immediately before and after a sync, capturing its return tuple and the duration in seconds.
- Add the action show tags.
The action show tags will now display panels of pipes grouped together by common tags. This is useful for large deployments which share common tags.
- Add dropdowns and inputs for flags with arguments to the Web Console.
Leverage the full power of the Meerschaum CLI in the Web Console with the new dynamic flags dropdowns.
- Fix shell crashes in Docker containers.
Reloading the running Meerschaum session from an interactive shell via a Docker container will no longer cause crashes on custom commands.
- Improve reloading times.
Reloading the running Meerschaum session has been sped up by several seconds (due to skipping the internal shell modules).
- Improve virtual environments in the Docker image.
Initial startup of Docker containers on a fresh persistent volume has been sped up due to preloading the default virtual environment creation. Additionally, the environment variable $MRSM_VENVS_DIR has been unset, reverting the virtual environments to be stored under /meerschaum/venvs.
v2.1.1 – v2.1.2¶
- Add upsert for high-performance pipes.
Setting upsert under pipe.parameters will create a unique index and combine the insert and update stages into a single upsert. This is particularly useful for pipes with very large tables.
- Add internal schema _mrsm_internal for temporary tables.
To avoid polluting your database schema, temporary tables will now be created within _mrsm_internal (for databases which support PostgreSQL-style schemas).
- Added mrsm_temporary_tables.
Temporary tables are logged to mrsm_temporary_tables, and their entries are deleted when the tables are dropped.
- Drop stale temporary tables.
Temporary tables which are more than 24 hours old will be dropped automatically (configurable under system:sql:instance:stale_temporary_tables_minutes).
- Prefix instance tables with mrsm_.
Internal Meerschaum tables still reside on the default schema but will be denoted with a mrsm_ prefix. The existing instance tables will be automatically renamed (e.g. pipes will become mrsm_pipes) to add the prefix (this renaming detection will be removed in later releases).
- Fix an issue with bootstrap.
Refactoring work for 2.1.0 had broken the bootstrap action.
- Fix an issue with pause jobs.
- Fix an issue when selecting inverse pipes.
Null location keys are now coalesced when selecting pipes to produce expected behavior.
- Avoid a system exit when exiting the SQL CLI.
v2.1.0¶
- Replace term.js with xterm.js.
This has been a long time coming. The webterm has been migrated to xterm.js, which is continuously supported, unlike term.js, which was last updated almost 10 years ago.
- Deprecate the legacy web pseudo-terminal.
Clicking the "Execute" button on the web console will now execute the command directly in the webterm. Additionally, changing the instance select will now automatically switch the webterm's context to the desired instance.
- Fix an issue when starting existing jobs.
A bug has been fixed which prevented jobs from restarting specifically by name.
- Add MRSM_VENVS_DIR.
Like MRSM_PLUGINS_DIR, you can now designate a virtual environments directory separate from the root directory. This is particularly useful for production deployments, and MRSM_VENVS_DIR has been set to /home/meerschaum/venvs in the official Docker images to allow for mounting /meerschaum to persistent volumes.
- Allow syncing NULL values into indices.
Syncing None within an index will now be coalesced into a magic value when applying updates.
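The reason for coalescing is that null values do not compare equal to one another (in SQL, and for NaN in Python), so rows with a null index would never match their earlier versions. The sketch below imitates the idea in plain Python; the sentinel value, apply_updates, and index_key are all hypothetical and do not reflect the magic value or code Meerschaum actually uses.

```python
import math
from typing import Any, Dict, List, Tuple

NULL_SENTINEL = "<null>"  # hypothetical stand-in for the "magic value"


def is_null(value: Any) -> bool:
    return value is None or (isinstance(value, float) and math.isnan(value))


def index_key(doc: Dict[str, Any], index_columns: List[str]) -> Tuple[Any, ...]:
    """Build a comparable key, replacing nulls with the sentinel."""
    return tuple(
        NULL_SENTINEL if is_null(doc.get(col)) else doc[col]
        for col in index_columns
    )


def apply_updates(
    existing: List[Dict[str, Any]],
    incoming: List[Dict[str, Any]],
    index_columns: List[str],
) -> List[Dict[str, Any]]:
    """Update rows whose coalesced index keys match; append the rest."""
    merged = {index_key(doc, index_columns): dict(doc) for doc in existing}
    for doc in incoming:
        merged.setdefault(index_key(doc, index_columns), {}).update(doc)
    return list(merged.values())


existing = [{"ts": "2024-01-01", "id": float("nan"), "value": 1}]
incoming = [{"ts": "2024-01-01", "id": None, "value": 2}]
print(apply_updates(existing, incoming, ["ts", "id"]))
```

Without the sentinel, the NaN in the existing row would never equal the incoming None, and the update would be stored as a second row.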
- Syncing Decimal objects will now enforce numeric dtypes.
For example, syncing a Decimal onto an integer column will update the dtype to numeric, like when syncing a float after an integer.
- Improve IS NULL and IS NOT NULL checks for params.
Mixing null-like values (e.g. NaN, <NA>, None) in params will now separate out nulls.
- Add colors to mrsm show columns.
- Fix a unicode decoding error when showing logs.
- Remove xstatic dependencies.
The xterm.js files are now bundled as static assets, so the term.js files are no longer needed. Hurray for removing dependencies!
- Other bugfixes.
A handful of minor bugfixes have been included in this release:
  - Removed non-connector environment variables like MRSM_WORK_DIR from the mrsm show connectors output.
  - Improved symlinks handling for multi-processed situations (mrsm start api).
2.0.x Releases¶
At long last, 2.0 has arrived! The 2.0 releases brought incredible change, from standardizing chunking to adding Pipe.verify() and Pipe.deduplicate() to introducing first-class numeric support. See the full release notes below for the complete picture.
v2.0.8 – v2.0.9¶
- Cast None to Decimal('NaN') for numeric columns.
To allow for all-null numeric columns, None (and other null-like types) are coerced to Decimal('NaN').
- Schema bugfixes.
A few minor edge cases have been addressed when working with custom schemas for pipes.
- Remove APIConnector.get_backtrack_data().
Since the release of 1.7, the get_backtrack_data() method for instance connectors has been optional.
NOTE: the backtrack_data API endpoint has also been removed.
- Other bugfixes.
Issues with changes made to session authentication have been addressed.
v2.0.5 – v2.0.7¶
- Add the numeric dtype (i.e. support for NUMERIC columns).
Specifying a column as numeric will coerce it into decimal.Decimal objects. For SQLConnectors, this will be stored as a NUMERIC column. This is useful for syncing a mix of integer and float values.
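To show what coercion into decimal.Decimal can look like for mixed integer and float values, here is a minimal standard-library sketch. The helper to_numeric is hypothetical and is not Meerschaum's dtype enforcement; mapping None to Decimal('NaN') mirrors the behavior described under v2.0.8 – v2.0.9.

```python
from decimal import Decimal
from typing import Any


def to_numeric(value: Any) -> Decimal:
    """Coerce an int, float, Decimal, or None into a Decimal."""
    if value is None:
        return Decimal("NaN")
    if isinstance(value, Decimal):
        return value
    # Go through str() so that a float keeps its printed value
    # (Decimal(0.1) would expose the binary floating-point expansion).
    return Decimal(str(value))


values = [1, 2.5, None, Decimal("3.14")]
print([to_numeric(value) for value in values])
```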
NOTE: Due to implementation limits, numeric has strict precision issues in embedded databases (SQLite and DuckDB: NUMERIC(15, 4)). PostgreSQL-like database flavors have the best support for NUMERIC; MySQL and MariaDB use a precision and scale of NUMERIC(38, 20), MSSQL uses NUMERIC(28, 10), and Oracle and PostgreSQL are not capped.
- Mixing int and float will cast to numeric.
Rather than always casting to TEXT, a column containing a mix of int and float will be coerced into numeric.
- Add schema to SQLConnectors.
Including schema as a connector key or as an argument in the URI will use this schema for created tables. The argument search_path will also set schema (e.g. for PostgreSQL).
- Add schema to pipe.parameters.
In addition to the default schema at the connector level, you may override this by setting schema under pipe.parameters.
- Add schema to meerschaum.utils.sql.sql_item_name().
You may now pass an optional schema when quoting.
- Add options to SQLConnector.
The key options will now contain a sub-dictionary of connection options, such as driver, search_path, or any other query parameters.
- Disable the "Sync Documents" accordion item when the session is not authenticated.
When running the API with --secure, only admin users will be able to access the "Sync Documents" accordion items on the pipes' cards.
- Remove dtype_backend from SQLConnector.read().
This argument previously had no effect. When applied, it was coercing JSON columns into strings, so it was removed.
- Remove meerschaum.utils.daemon.Log.
This had been replaced by meerschaum.utils.daemon.RotatingLog and had been broken since the 2.0 release.
- Remove params from Pipe.filter_existing().
To avoid confusion, filter parameters are instead derived from the incoming DataFrame. This will improve performance when repeatedly syncing chunks which span the same interval. The default limit of 250 unique values may be configured under pipes:sync:filter_params_index_limit.
- Add forwarded_allow_ips and proxy_headers to the web API.
The default values forwarded_allow_ips='*' and proxy_headers=True are set when running Uvicorn or Gunicorn and will help when running Meerschaum behind a proxy.
- Bump dash-extensions to >=1.0.4.
The bug that was holding back the version was due to including enrich.ServersideTransform in the dash proxy without actually utilizing it.
v2.0.3 – v2.0.4¶
- Fix an issue with --timeout-seconds.
Previous refactoring efforts had broken the --timeout-seconds polling behavior.
- Fix a formatting issue when pretty-printing pipes.
Pipes may now be correctly printed if both single and double quotes appear in a message.
- Allow omitting port for APIConnectors.
You may now omit the port attribute for APIConnectors to use the protocol-default port (e.g. 443 for HTTPS). Note you will need to delete the key api:default:port via mrsm edit config if it's present.
- Add optional verify key to API connectors.
Client API connectors may now be used with self-signed HTTPS instances.
- Bump duckdb to version 0.9.0.
This adds complete support for PyArrow data types to DuckDB.
v2.0.2¶
- Syncing with --skip-check-existing will not apply the backtrack interval.
Because --skip-check-existing (or check_existing=False) is guaranteed to produce duplicates, the backtrack interval will be set to 0 when running in insert-only mode.
- Allow for columns to be a list.
Note that a pipe built with columns as a list must have its datetime column named datetime.
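One way to read the list form is that each entry serves as both the index label and the column name, which would explain why the datetime column must literally be named datetime. The sketch below encodes that reading; normalize_columns is a hypothetical helper and the mapping is an assumption, not a description of Meerschaum's internals.

```python
from typing import Dict, List, Union


def normalize_columns(columns: Union[List[str], Dict[str, str]]) -> Dict[str, str]:
    """Return a label-to-column mapping, expanding a list to identical pairs."""
    if isinstance(columns, dict):
        return dict(columns)
    # Assumption: each listed name is used as both the label and the column.
    return {name: name for name in columns}


print(normalize_columns(["datetime", "id"]))
print(normalize_columns({"datetime": "ts", "id": "station_id"}))
```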
- Bump default SQLAlchemy pool size to 8 connections.
- Consider the number of checked out connections when choosing workers.
For pipes on sql instances, pipe.get_num_workers() will now consider the number of checked out connections rather than only the number of active threads.
- Fix pipe.get_data(as_dask=True) for JSON columns.
v2.0.1¶
- Fix syncing bools within in-place SQL pipes.
SQL pipes may now sync bools in-place. For database flavors which lack native BOOLEAN support (e.g. sqlite, oracle, mysql), the boolean columns must be stated in pipe.dtypes.
- Fix an issue with multiple users managing jobs.
Extra validation was added to the web UI to allow for multiple users to interact with jobs.
- Fix a minor formatting bug with highlight_pipes().
Improved validation logic was added to prevent incorrectly prepending the Pipe( prefix.
- Hold back pydantic to <2.0.0.
Pydantic 2 is supported in all features except --schedule. Until rocketry supports Pydantic 2, it will be held back.
v2.0.0¶
Breaking Changes
- Removed redundant Pipe.sync_time property.
  Use pipe.get_sync_time() instead.
- Removed SQLConnector.get_pipe_backtrack_minutes().
  Use pipe.get_backtrack_interval() instead.
- Replaced pipe.parameters['chunk_time_interval'] with pipe.parameters['verify']['chunk_minutes'].
  For better security and cohesiveness, the TimescaleDB chunk_time_interval value is now derived from the standard chunk_minutes value. This also means pipes with integer date axes will be created with a new default chunk interval of 1440 (was previously 100,000).
- Moved choose_subaction() into meerschaum.actions.
  This function is for internal use and as such should not affect any users.
Features
- Added verify pipes and --verify.
  The command mrsm verify pipes or mrsm sync pipes --verify will resync pipes' chunks with different rowcounts to catch any backfilled data.
- Added deduplicate pipes and --deduplicate.
  Running mrsm deduplicate pipes or mrsm sync pipes --deduplicate will iterate over pipes' entire intervals, chunking at the configured chunk interval (see pipe.get_chunk_interval() below) and clearing + resyncing chunks with duplicate rows.
  If your instance connector implements deduplicate_pipe() (e.g. SQLConnector), then this method will override the default pipe.deduplicate().
- Added pyarrow support.
  The dtypes enforcement system was overhauled to add support for pyarrow data types.
- Added bool support.
  Pipes may now sync DataFrames with booleans (even on Oracle and MySQL).
- Added preliminary dask support.
  For example, you may now return Dask DataFrames in your plugins and pass them into pipe.sync(), and pipe.get_data() now has the flag as_dask.
- Added chunk_minutes to pipe.parameters['verify'].
  Like pipe.parameters['fetch']['backtrack_minutes'], you may now specify the default chunk interval to use for verification syncs and iterating over the datetime axis.
- Added --chunk-minutes, --chunk-hours, and --chunk-days.
  You may override a pipe's chunk interval during a verification sync with --chunk-minutes (or --chunk-hours or --chunk-days).
- Added pipe.get_chunk_interval() and pipe.get_backtrack_interval().
  Return the timedelta (or int for integer datetimes) from verify:chunk_minutes and fetch:backtrack_minutes, respectively.
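The timedelta-or-int behavior described for pipe.get_chunk_interval() and pipe.get_backtrack_interval() can be sketched in a few lines. The helper minutes_to_interval and its integer_axis flag are hypothetical and only illustrate the idea; they are not Meerschaum functions.

```python
from datetime import timedelta
from typing import Union


def minutes_to_interval(minutes: int, integer_axis: bool = False) -> Union[timedelta, int]:
    """Return a timedelta for datetime axes or the bare integer for integer axes."""
    return minutes if integer_axis else timedelta(minutes=minutes)


# A configured value of 1440 minutes on a datetime axis versus an integer axis.
print(minutes_to_interval(1440))
print(minutes_to_interval(1440, integer_axis=True))
```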
- Added pipe.get_chunk_bounds().
  Return a list of begin and end values to use when iterating over a pipe's datetime axis.
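As a rough picture of what a list of begin and end values looks like, the sketch below splits a range into consecutive chunks. It covers only the bounded case and assumes half-open chunks; chunk_bounds is a hypothetical stand-in, not the real pipe.get_chunk_bounds().

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def chunk_bounds(begin, end, interval) -> List[Tuple]:
    """Split [begin, end) into consecutive (chunk_begin, chunk_end) pairs."""
    bounds = []
    chunk_begin = begin
    while chunk_begin < end:
        # The final chunk is clipped so it never runs past `end`.
        chunk_end = min(chunk_begin + interval, end)
        bounds.append((chunk_begin, chunk_end))
        chunk_begin = chunk_end
    return bounds


# Works for datetime axes...
for bound in chunk_bounds(datetime(2023, 1, 1), datetime(2023, 1, 3, 12), timedelta(days=1)):
    print(bound)

# ...and for integer axes.
print(chunk_bounds(0, 250, 100))
```

The same loop serves both axis types because datetimes and integers each support addition with their own interval type.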
- Added --bounded to verification syncs.
  By default, verify pipes is unbounded, meaning it will sync values beyond the existing minimum and maximum datetime values. Running a verification sync with --bounded will bound the search to the existing datetime axis.
- Added pipe.get_num_workers().
  Return the number of concurrent threads to be used with this pipe (with respect to its instance connector's thread safety).
- Added select_columns and omit_columns to pipe.get_data().
  In situations where not all columns are required, you can now specify which columns to include (select_columns) and which columns to filter out (omit_columns). You may pass a list of columns or a single column, and the value '*' for select_columns will be treated as None (i.e. SELECT *).
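The interplay of select_columns and omit_columns can be sketched as a small column resolver. resolve_columns is a hypothetical helper written for this example; it assumes omitted columns are removed after the selection is made and is not the library's implementation.

```python
from typing import List, Union


def resolve_columns(
    existing: List[str],
    select_columns: Union[str, List[str], None] = None,
    omit_columns: Union[str, List[str], None] = None,
) -> List[str]:
    """Return the columns to read, keeping the order of `existing`."""
    # Accept a single column name as well as a list.
    if isinstance(select_columns, str):
        select_columns = [select_columns]
    if isinstance(omit_columns, str):
        omit_columns = [omit_columns]
    # '*' (or nothing at all) means every column, i.e. SELECT *.
    if not select_columns or '*' in select_columns:
        select_columns = list(existing)
    omit = set(omit_columns or [])
    return [col for col in existing if col in select_columns and col not in omit]


columns = ['ts', 'id', 'val']
print(resolve_columns(columns, select_columns='*', omit_columns='val'))
print(resolve_columns(columns, select_columns=['ts', 'val']))
```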
- Replace daemoniker with python-daemon.
  python-daemon is a well-maintained and well-behaved daemon process library. However, this migration removes Windows support for background jobs (which was never really fully supported already, so no harm there).
- Added pause jobs.
  In addition to start jobs and stop jobs, the command pause jobs will suspend a job's daemon. Jobs may be resumed with start jobs (i.e. Daemon.resume()).
- Added job management to the UI.
  Now that jobs and logs are much more robust, more job management features have been added to the web UI. Jobs may be started, stopped, paused, and resumed from the web console, and their logs are now available for download.
- Logs now roll over and are preserved on job restarts.
  Spin up long-running jobs with peace of mind now that logs are automatically rolled over, keeping five 500 KB files on disk at any moment (you can tweak these values with mrsm edit config jobs). To facilitate this, meerschaum.utils.daemon.RotatingFile was added to provide a generic file-like object, complete with its own file descriptor.
- Starting existing jobs with -d will not throw an exception if the arguments match.
  Similarly, running without any arguments other than --name will run the existing job. This matches the behavior of start jobs.
- Allow for colon-separated paths in MRSM_PLUGINS_DIR.
  Just like PATH in bash, you may now specify your plugins' paths in a single variable, separated by colons. Unlike bash, however, a blank path will not be interpreted as the current directory.
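The blank-path rule for MRSM_PLUGINS_DIR is the main difference from PATH, so here is a minimal sketch of it. The helper split_plugins_paths and the example paths are hypothetical, and the sketch assumes POSIX-style paths without colons of their own.

```python
from typing import List


def split_plugins_paths(value: str) -> List[str]:
    """Split a colon-separated value, dropping blank entries instead of treating them as '.'."""
    return [path for path in value.split(':') if path]


# The empty entry between the two colons is skipped rather than read as the current directory.
print(split_plugins_paths('/opt/plugins::/home/user/plugins'))
```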
- Add pipe.keys().
  pipe.keys() returns the connector, metric, and location keys (i.e. pipe.meta without the instance).
- Pipes are now indexable.
  Indexing a pipe directly is the same as accessing pipe.attributes.
Other changes
- Fixed backtracking being incorrectly applied to --begin.
  Application of the backtracking interval has been consolidated into pipe.fetch().
- Improved data type enforcement for SQL pipes.
  A pipe's data types are now passed to SQLConnector.read() when fetching its data.
- Added meerschaum.utils.sql.get_db_version() and SQLConnector.db_version.
- Moved print_options() from meerschaum.utils.misc into meerschaum.utils.formatting.
  This places print_options() next to print_tuple and pprint. A placeholder function is still present in meerschaum.utils.misc to preserve existing behavior.
- mrsm.pprint() will now pretty-print SuccessTuples.
- Added calm to print_tuple().
  Printing a SuccessTuple with calm=True will use a more muted color scheme and emoji.
- Removed round_down from get_sync_time() for instance connectors.
  To avoid confusion, sync times are no longer truncated by default. round_down is still an optional keyword argument on pipe.get_sync_time().
- Created meerschaum.utils.dtypes.
  - Added are_dtypes_equal() to meerschaum.utils.dtypes.
  - Added get_db_type_from_pd_type() to meerschaum.utils.dtypes.sql.
  - Added get_pd_type_from_db_type() to meerschaum.utils.dtypes.sql.
  - Moved to_pandas_dtype() from meerschaum.utils.misc into meerschaum.utils.dtypes.
- Created meerschaum.utils.dataframe.
  - Added chunksize_to_npartitions() to meerschaum.utils.dataframe.
  - Added get_first_valid_dask_partition() to meerschaum.utils.dataframe.
  - Moved filter_unseen_df() from meerschaum.utils.misc into meerschaum.utils.dataframe.
  - Moved add_missing_cols_to_df() from meerschaum.utils.misc into meerschaum.utils.dataframe.
  - Moved parse_df_datetimes() from meerschaum.utils.misc into meerschaum.utils.dataframe.
  - Moved df_from_literal() from meerschaum.utils.misc into meerschaum.utils.dataframe.
  - Moved get_json_cols() from meerschaum.utils.misc into meerschaum.utils.dataframe.
  - Moved get_unhashable_cols() from meerschaum.utils.misc into meerschaum.utils.dataframe.
  - Moved enforce_dtypes() from meerschaum.utils.misc into meerschaum.utils.dataframe.
  - Moved get_datetime_bound_from_df() from meerschaum.utils.misc into meerschaum.utils.dataframe.
  - Moved df_is_chunk_generator() from meerschaum.utils.misc into meerschaum.utils.dataframe.
- Refactored SQL utilities.
  - Added format_cte_subquery() to meerschaum.utils.sql.
  - Added get_create_table_query() to meerschaum.utils.sql.
  - Added get_db_version() to meerschaum.utils.sql.
  - Added get_rename_table_queries() to meerschaum.utils.sql.
- Moved choices_docstring() from meerschaum.utils.misc into meerschaum.actions.
- Fixed handling backslashes for stack on Windows.
1.7.x Releases¶
The 1.7 series was short and sweet with a big focus on improving the web API. The highlight feature was the integrated webterm, and the series includes many bugfixes and improvements.
v1.7.3 – v1.7.4¶
- Fix an issue with the local stack healthcheck.
  Due to some edge cases, the local stack docker-compose.yaml file would not be correctly formatted until edit config had been executed. This patch ensures the files are synced with each invocation of stack.
- Fix an issue when running the local stack with non-default ports.
  Initializing a local stack with a different database port (e.g. 5433) now routes correctly within the Docker compose network (the internal port is now patched to 5432).
- Fix upgrade mrsm behavior.
  Recent changes to stack broke the automatic stack pull within mrsm upgrade mrsm.
v1.7.2¶
- Fix role "root" does not exist from stack logs.
  Although the healthcheck was working as expected, the log output was filled with Error FATAL: role "root" does not exist. These errors have been fixed.
- Fix MRSM_CONFIG behavior when running start api --production.
  Starting the Web API through gunicorn (i.e. --production) now respects MRSM_CONFIG. This is useful for running stack up with non-default credentials.
- Added --insecure as an alias for --no-auth.
  To complement the newly added --secure flag, starting the Web API with --insecure will bypass authentication.
- Bump default TimescaleDB version to PG15.
  The default TimescaleDB version for the Meerschaum stack is now latest-pg15-oss.
- Pass sysargs to docker compose via stack.
  This patch allows for jumping into the api container.
- Added the API endpoint /healthcheck.
  This is used to determine reachability and the health of the local stack.
v1.7.0 – v1.7.1¶
- Remove get_backtrack_data() for instance connectors.
  If provided, this method will still override the new generic implementation.
- Add --keyfile and --certfile support.
  When starting the Web API, you may now run via HTTPS with --keyfile and --certfile. Older releases required the keys to be set in MRSM_CONFIG. This also brings SSL support for --production (Gunicorn).
- Add the Webterm to the Web Console.
  At long last, the webterm is embedded within the web console and is accessible from the Web API at the endpoint /webterm. You must provide your active, authorized session ID to access the Webterm.
- Add --secure to start api.
  Starting the Web API with --secure will now disallow actions from non-administrators. This is recommended for shared deployments.
- Fixed the registration page on the Web API.
  Users should now be able to create accounts from Dockerized deployments.
- Held back dash-extensions.
  The recent 1.0.2+ releases have shipped some broken changes, so dash-extensions is held back to 1.0.1 until newer releases have been tested.
- Allow for digits in environment connectors.
  Connectors defined as environment variables may now have digits in the type.
- Fixed stack on Windows.
- Fixed a false error with background jobs.
- Increased the minimum password length to 5.
1.6.x Releases¶
The biggest features of the 1.6.x series were all about chunking and adding support for syncing generators. The series was also full of minor bugfixes, contributing to an even more polished experience. It also was the first release to drop support for a Python version, formally deprecating Python 3.7.
v1.6.16 – v1.6.19¶
- Add Pydantic v2 support.
  The only feature which requires Pydantic v1 is the --schedule flag, which will throw a warning with a hint to install an older version. The underlying libraries for this feature should have Pydantic v2 support merged soon.
- Bump dependencies.
  This patch bumps the minimum required versions for typing-extensions, rich, prompt-toolkit, rocketry, uvicorn, websockets, and fastapi and loosens the minimum version of pydantic.
- Fix shell formatting on Windows 10.
  Some edge case issues have been patched for older versions of Windows.
v1.6.15¶
- Sync chunks in the copy pipes action.
  This will help with large out-of-memory pipes.
v1.6.14¶
-
Added healthchecks to
mrsm stack up.
The internal Docker Compose file formrsm stackwas bumped to version 3.9, and secrets were replaced with environment variable references. -
Fixed
--no-authwhen starting the API.
The commandmrsm start api --no-authnow correctly handles sessions.
v1.6.13¶
- Remove \\u0000 from strings when inserting into PostgreSQL.
  Replace both \0 and \\u0000 with empty strings when streaming rows into PostgreSQL.
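The replacement described above amounts to two string substitutions. The sketch below shows the idea with a hypothetical helper named strip_nul; it is not the code Meerschaum runs when streaming rows.

```python
def strip_nul(value: str) -> str:
    """Remove NUL characters and the escaped sequence \\u0000 from a string."""
    # '\0' is the actual NUL character; '\\u0000' is the six-character escaped form.
    return value.replace('\0', '').replace('\\u0000', '')


print(strip_nul('abc\0def'))
print(strip_nul('abc\\u0000def'))
```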
v1.6.12¶
- Allow nested chunk generators.
  This patch more gracefully handles labels for situations with nested chunk generators and adds an explicit test for this scenario.
v1.6.11¶
- Fix an issue with in-place syncing.
  When syncing a SQL pipe in-place with a backtrack interval, the interval is applied to the existing data stage to avoid inserting duplicate rows.
v1.6.9 – v1.6.10¶
- Improve thread safety checks.
  Added checks for IS_THREAD_SAFE to connectors to determine whether to use multithreading.
- Fix an issue with custom flags while syncing.
  This patch includes better handling of custom flags added from plugins during the syncing process.
v1.6.8¶
- Added as_iterator to Pipe.get_data().
  Passing as_iterator=True (or as_chunks) to Pipe.get_data() returns a generator which yields chunks of Pandas DataFrames.
  Each DataFrame is the result of a Pipe.get_data() call with intermediate datetime bounds between begin and end of size chunk_interval (default datetime.timedelta(days=1) for time-series / 100,000 IDs for integers).
- Add server-side cursor support to SQLConnector.read().
  If chunk_hook is provided, keep an open cursor and stream the chunks one-at-a-time. This allows for processing very large out-of-memory data sets.
  To return the results of the chunk_hook callable rather than a dataframe, pass as_hook_result=True to receive a list of values.
  If as_iterator is provided or chunksize is None, then SQLConnector.read() reverts to the default client-side cursor implementation (which loads the entire result set into memory).
- Remove --sync-chunks and set its behavior as default.
  Due to the above changes to SQLConnector.read(), sync_chunks now defaults to True in Pipe.sync(). You may disable this behavior with --chunksize 0.
v1.6.7¶
- Improve memory usage when syncing generators.
  To more lazily sync chunks from generators, pool.map() has been replaced with pool.imap().
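For context on the switch above: in the standard library, pool.map() turns its input into a list before dispatching and returns only once every result is ready, while pool.imap() returns an iterator that hands back results in order as they become available. The sketch below uses multiprocessing.pool.ThreadPool with made-up chunks() and sync_chunk() functions; the assumption that Meerschaum uses a comparable pool is mine.

```python
from multiprocessing.pool import ThreadPool


def chunks():
    """Yield small lists of rows, standing in for DataFrame chunks."""
    for start in range(0, 9, 3):
        yield list(range(start, start + 3))


def sync_chunk(chunk):
    """Pretend to sync a chunk and report how many rows it held."""
    return len(chunk)


with ThreadPool(2) as pool:
    # imap() gives back an iterator over results instead of a finished list.
    results = list(pool.imap(sync_chunk, chunks()))

print(results)
```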
v1.6.6¶
- Issue one ALTER TABLE query per column for SQLite, MSSQL, DuckDB, and Oracle SQL.
  SQLite and other flavors do not support multiple columns in an ALTER TABLE query. This patch addresses this behavior and adds a specific test for this scenario.
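Issuing one statement per column can be sketched as below. The helper get_add_column_queries is hypothetical, and the ADD COLUMN syntax shown is the SQLite style; other flavors spell it differently, so treat this purely as an illustration of the one-query-per-column idea.

```python
from typing import Dict, List


def get_add_column_queries(table: str, new_columns: Dict[str, str]) -> List[str]:
    """Build one ALTER TABLE statement per new column (SQLite-style syntax)."""
    return [
        f'ALTER TABLE "{table}" ADD COLUMN "{col}" {db_type}'
        for col, db_type in new_columns.items()
    ]


for query in get_add_column_queries('weather', {'humidity': 'REAL', 'station': 'TEXT'}):
    print(query)
```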
v1.6.5¶
- Allow pipes to sync DataFrame generators.
  If pipe.sync() receives a generator (for DataFrames, dictionaries, or lists), it will attempt to consume it and sync its chunks in parallel threads (this can be single-threaded with --workers 1). For SQL pipes, this will be capped at your configured pool size (default 5) minus the running number of threads.
  This means you may now return generators for large transactions, such as reading a large CSV.
  Any iterator of DataFrame-like chunks will work.
  This new behavior has been added to SQLConnector.fetch() so you may now confidently sync very large tables between your databases.
  NOTE: The default chunksize for SQL queries has been lowered to 100,000 from 1,000,000. You may alter this value with --chunksize or setting the value in MRSM{system:connectors:sql:chunksize} (you can also edit the default pool size here).
- Fix edge case with SQL in-place syncs.
  Occasionally, a few docs would be duplicated when running in-place SQL syncs. This patch increases the fetch window size to mitigate the issue.
- Remove begin and end from filter_existing().
  The keyword arguments were interfering with the determined datetime bounds, so this patch removes these flags (albeit begin was already ignored) to avoid confusion. Date bounds are solely determined from the contents of the DataFrame.
v1.6.4¶
- Allow for mixed UTC offsets in datetimes.
  UTC offsets are now applied to datetime values before timezone information is stripped, which should now reflect accurate values. This patch also fixes edge cases when different offsets are synced within the same transaction.
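Applying the offset before stripping the timezone can be shown with the standard library alone. The helper to_naive_utc is a hypothetical name for this sketch, which assumes ISO 8601 input strings; it does not reproduce Meerschaum's pandas-based handling.

```python
from datetime import datetime, timezone


def to_naive_utc(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, shift it to UTC, then drop the timezone info."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Nothing to shift: the value is already timezone-naive.
        return parsed
    return parsed.astimezone(timezone.utc).replace(tzinfo=None)


# Two different offsets in the same batch end up on one consistent UTC axis.
print(to_naive_utc('2023-01-01T00:00:00-05:00'))
print(to_naive_utc('2023-01-01T10:00:00+02:00'))
```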
- Allow skipping datetime detection.
  The automatic datetime detection feature now respects a pipe's dtypes; columns that aren't of type datetime64[ns] will be ignored.
- Added utility method enforce_dtypes().
  The DataFrame data type enforcement logic of pipe.enforce_dtypes() has been exposed as meerschaum.utils.misc.enforce_dtypes().
- Performance improvements.
  Some of the unnecessarily immutable transformations have been replaced with more memory- and compute-efficient in-place operations. Other small improvements like better caching should also speed things up.
- Removed noise from debug output.
  The virtual environment debug messages have been removed to make --debug easier to read.
- Better handle inferred datetime index.
  The inferred datetime index feature may now be disabled by setting datetime to None. Improvements were made to better handle incorrectly identified indices.
- Improve dynamic dtypes for SQLite.
  SQLite doesn't allow for modifying column types but is usually dynamic with data types. A few edge cases have been solved with a workaround for altering the table's definition.
v1.6.3¶
- Fixed an issue with background jobs.
  A change that had broken daemon functionality has been reverted.
v1.6.2¶
- Virtual environment and pip tweaks.
  With upcoming changes to pip due to PEP 668, this patch sets the environment variable PIP_BREAK_SYSTEM_PACKAGES when executing pip internally. All packages are installed within virtual environments except uvicorn, gunicorn, and those explicitly installed with a venv of None.
- Change how pipes are pretty-printed.
  Printing the attributes of a single pipe now highlights the keys in blue.
- Fix an issue with bootstrap pipes and plugins.
  When bootstrapping a pipe with a plugin connector, the plugin's virtual environment will now be activated while executing its register() function.
- Update dependencies.
  The minimum version of duckdb was bumped to 0.7.1, duckdb-engine was bumped to 0.7.0, and pip was lowered to 22.0.4 to accept older versions. Additionally, pandas==2.0.0rc1 was tested and confirmed to work, so version 1.7.x of Meerschaum will likely require 2.0+ of pandas to make use of its PyArrow backend.
v1.6.0 – v1.6.1¶
Breaking Changes
- Dropped Python 3.7 support.
  The latest pandas requires 3.8+, so to use Pandas 1.5.x, we have to finally drop Python 3.7.
- Upgrade SQLAlchemy to 2.0.5+.
  This includes better transaction handling with connections. Other packages which use SQLAlchemy may not yet support 2.0+.
- Removed MQTTConnector.
  This was one of the original connectors but was never tested or used in production. It may be reintroduced via a future mqtt plugin.
Bugfixes and Improvements
- Stop execution when improper command-line arguments are passed in.
  Incorrect command-line arguments will now return an error. The previous behavior was to strip the flags and execute the action anyway, which was undesirable.
- Allow bootstrap connector to create custom connectors.
  The bootstrap connector wizard can now handle registering custom connectors. It uses the REQUIRED_ATTRIBUTES list set in the custom connector class when determining what to ask for.
- Allow custom connectors to omit __init__().
  If a connector is created via @make_connector and doesn't have an __init__() function, the base one is used to create the connector with the correct type (derived from the class name) and verify the REQUIRED_ATTRIBUTES values if present.
- Infer a connector's type from its class name.
  The type of a connector is now determined from its class name (e.g. FooConnector would have a type foo). When inheriting from Connector, it is no longer required to explicitly pass the type before the label. For backwards compatibility, the legacy method still behaves as expected.
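Deriving a type such as foo from FooConnector can be sketched as stripping the suffix and lowercasing the rest. The function infer_connector_type is hypothetical, and how multi-word class names are treated here is an assumption rather than documented behavior.

```python
def infer_connector_type(cls: type) -> str:
    """Derive a connector type from a class name, e.g. FooConnector -> 'foo'."""
    name = cls.__name__
    suffix = 'Connector'
    # Only strip the suffix when something is left over to serve as the type.
    if name.endswith(suffix) and len(name) > len(suffix):
        name = name[:-len(suffix)]
    return name.lower()


class FooConnector:
    """Stand-in class; a real connector would inherit from Meerschaum's Connector."""


print(infer_connector_type(FooConnector))
```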
- Allow connectors to omit a label.
  The default label main will be used if label is omitted.
- Add meta keys to connectors.
  Like pipes, the meta property of a connector returns a dictionary with the kwargs needed to reconstruct the connector.
- Remove NUL bytes when inserting into PostgreSQL.
  PostgreSQL doesn't support NUL bytes in text ('\0'), so these characters are removed from strings when copying into a table.
- Cache pipe.exists() for 5 seconds.
  Repeated calls to pipe.exists() will be sped up due to short-term caching. This cache is invalidated when syncing or dropping a pipe.
- Fix an edge case with subprocesses in headless environments.
  Checks were added to subprocesses to prevent using interactive features when no such features may be available (i.e. termios).
- Added pprint(), get_config(), and attempt_import() to the top-level namespace.
  Frequently used functions pprint(), get_config(), and attempt_import() have been promoted to the root level of the meerschaum namespace.
- Fix CLI for MSSQL.
  The interactive CLI has been fixed for Microsoft SQL Server.
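The short-term caching of pipe.exists() mentioned in this release can be pictured as a small time-to-live cache with explicit invalidation. The ExistsCache class below is a hypothetical sketch, including its injectable now argument used to keep the example deterministic; it is not how Meerschaum stores the cached value.

```python
import time


class ExistsCache:
    """Remember a value for a short time to avoid repeated lookups."""

    def __init__(self, ttl_seconds: float = 5.0):
        self.ttl_seconds = ttl_seconds
        self._value = None
        self._timestamp = None

    def get(self, compute, now=None):
        """Return the cached value, recomputing it if it is missing or expired."""
        now = time.monotonic() if now is None else now
        if self._timestamp is None or now - self._timestamp >= self.ttl_seconds:
            self._value = compute()
            self._timestamp = now
        return self._value

    def invalidate(self):
        """Forget the cached value, e.g. after a sync or a drop."""
        self._timestamp = None


calls = []

def check_exists():
    calls.append(1)
    return True

cache = ExistsCache()
cache.get(check_exists, now=0.0)  # computed
cache.get(check_exists, now=3.0)  # served from the cache
cache.get(check_exists, now=6.0)  # expired, so recomputed
cache.invalidate()
cache.get(check_exists, now=6.5)  # invalidated, so recomputed
print(len(calls))
```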
1.5.x Releases¶
The 1.5.x series offered many great improvements, namely the ability to use an integer datetime axis and the addition of JSON columns.
v1.5.8 – v1.5.10¶
- Infer JSON columns from the first non-null value.
  When determining complex columns (dictionaries or lists), the first non-null value of the dataframe is checked rather than the first row only. This accounts for documents which contain variable keys in the same sync.
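Checking the first non-null value rather than the first row can be sketched on plain dictionaries. The helper get_json_columns is hypothetical and works on a list of documents instead of a DataFrame, so it only mirrors the idea.

```python
from typing import Any, Dict, List


def get_json_columns(docs: List[Dict[str, Any]]) -> List[str]:
    """Return columns whose first non-null value is a dict or a list."""
    first_values: Dict[str, Any] = {}
    for doc in docs:
        for col, value in doc.items():
            # Skip nulls so a leading None does not hide a later dict or list.
            if value is not None and col not in first_values:
                first_values[col] = value
    return [col for col, value in first_values.items() if isinstance(value, (dict, list))]


docs = [
    {'id': 1, 'meta': None},
    {'id': 2, 'meta': {'color': 'red'}},
]
print(get_json_columns(docs))
```

Looking only at the first row would have missed the meta column here because its first value is null.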
- Fix a bug when reconstructing JSON columns.
  When rebuilding JSON values after merging, a check is first performed if the value is in fact a string (sometimes NULLs slip in).
- Increase the timeout when determining Python versions.
  This fixes some difficult-to-reproduce bugs on Windows.
v1.5.7¶
- Replace ast.literal_eval() with json.loads() when filtering JSON columns.
  This patch replaces the use of str and ast.literal_eval() with json.dumps() and json.loads() to preserve accuracy.
- Fix a subtle bug with subprocesses.
  The function run_python_package() now better handles environment passing and raises a more verbose warning when something goes wrong.
- Allow columns with 'create' in the name.
  A security measure previously disallowed certain keywords when sanitizing input. Now columns are allowed to contain certain keywords.
v1.5.3 – v1.5.6¶
- Pipes now support syncing dictionaries and lists.
  Complex columns (dicts or lists) will now be preserved.
  You can also force strings to be parsed by setting the data type to json.
  For PostgreSQL-like databases (e.g. TimescaleDB), this is stored as JSONB under the hood. For all others, it's stored as the equivalent for TEXT.
- Fixed determining the version when installing plugins.
  Like the required list, the __version__ string must be explicitly set in order for the correct version to be determined.
- Automatically cast postgres to postgresql.
  When a SQLConnector is built with a flavor of postgres, it will be automatically set to postgresql.
v1.5.0 – v1.5.2¶
- Pipes may now use integers for the datetime column.
  If you use an auto-incrementing integer as your primary key, you may now use that column as your pipe's datetime column; just specify the dtype as an Int64.
  This applies the same incremental range filtering logic as is normally done on the datetime axis.
- Allow for multiple plugins directories.
  You may now set multiple directories for MRSM_PLUGINS_DIR. All of the plugins contained in each directory will be symlinked together into a single plugins namespace. To do this, just set MRSM_PLUGINS_DIR to a JSON-encoded list.
- Better Windows support.
  At long last, the color issues plaguing Windows users have finally been resolved. Additionally, support for background jobs has been fixed on Windows, though the daemonization library I use is pretty hacky and doesn't make for the smoothest experience. But at least it works now!
- Fixed unsafe TAR extraction.
  A PR about unsafe use of tar.extractall() brought this issue to light.
- Fixed the blank logs bug in show logs.
  Backtracking a couple lines before following the rest of the logs has been fixed.
- Requirements may include brackets.
  Python packages listed in a plugin's requirements list may now include brackets (e.g. meerschaum[api]).
- Enforce 1000 row limit in SQLConnector.to_sql() for SQLite.
  When inserting rows, the chunksize of 1000 is enforced for SQLite (was previously enforced only for reading).
- Patch parameters from --params in edit pipes and register pipes.
  When editing or registering pipes, the value of --params will now be patched into the pipe's parameters. This should be very helpful when scripting.
- Fixed edit users.
  This really should have been fixed a long time ago. The action edit users was broken due to a stray import left over from a major refactor.
- Fixed a regex bug when cleaning up packages.
- Removed show gui and show modules.
1.4.x Releases¶
The 1.4.x series brought some incredible, stable releases, and the highlight feature was in-place SQL syncs for massive performance improvements. The addition of the temporary flag to Pipes also made using pipes in projects more accessible.
v1.4.14¶
- Added flag temporary to Pipe (and --temporary).
  Pipes built with temporary=True will not create instance tables (pipes, users, and plugins) or be able to modify registration. This is particularly useful when creating pipes from existing tables when automatic registration is not desired.
- Fixed potential security issue with public instance tables.
  The API now refuses to sync or serve data if the target is a protected instance table (pipes, users, or plugins).
- Added not-null check to pipe.get_sync_time().
  The datetime column should never contain null values, but just in case, pipe.get_sync_time() now passes a not-null check to params for the datetime column.
- Removed prompt for value from pipe.bootstrap().
  The prompt for an optional value column has been removed from the bootstrapping wizard because pipe.columns is now largely used as a collection of indices rather than the original purpose of meta-columns.
- Pass --debug and other flags in copy pipes.
  Command line flags are now passed to the new pipe when copying an existing pipe.
v1.4.12 – v1.4.13¶
- Fixed an issue when syncing empty DataFrames (#95).
  When syncing an empty list of documents, Pipe.filter_existing() would trigger pulling the entire table into memory. This patch adds a check if the dataframe is empty.
- Allow the datetime column to be omitted in the bootstrap wizard.
  Now that the datetime index is optional, the bootstrapping wizard allows users to skip this index.
- Fixed a small issue when syncing to MySQL.
  Due to the addition of MySQL 5.7 support in v1.4.11, a slight edge case arose which broke SQL definitions. This patch fixes MySQL behavior when a WHERE clause is present in the definition.
v1.4.11¶
- Add support for older versions of MySQL.
  The WITH keyword for CTE blocks was not introduced until MySQL 8.0. This patch uses the older syntax for older versions of MySQL and MariaDB. MySQL 5.7 was added to the test suite.
- Allow for any iterable in items_str().
  If an iterable other than a list is passed to items_str(), it will be converted to a list before building the string.
- Fixed an edge case with datetime set to None.
  This patch will ignore the datetime index even if it was set explicitly to None.
- Added Pipe.children.
  To complement Pipe.parents, setting the parameters key children to a list of pipes' keys will be treated the same as Pipe.parents.
- Added support for type:label syntax in mrsm.get_connector().
  The factory function mrsm.get_connector() expects the type and label as two arguments, but this patch allows for passing a single string with both arguments.
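Accepting both the two-argument form and the single type:label string boils down to one split. The sketch below uses a hypothetical helper, parse_connector_keys, and assumes the label falls back to main when it is missing; it is not the body of mrsm.get_connector().

```python
from typing import Optional, Tuple


def parse_connector_keys(type_: str, label: Optional[str] = None) -> Tuple[str, str]:
    """Accept either ('sql', 'main') or the single string 'sql:main'."""
    if label is None and ':' in type_:
        # Split only once so labels containing ':' stay intact.
        type_, label = type_.split(':', 1)
    return type_, label or 'main'


print(parse_connector_keys('sql', 'local'))
print(parse_connector_keys('sql:local'))
```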
- Fixed more edge case bugs.
  For example, converting to Int64 sometimes breaks with older versions of pandas. This patch adds a workaround.
v1.4.10¶
- Fixed an issue with syncing background jobs.
  The --name flag of background jobs was colliding with the name keyword argument of SQLConnector.to_sql().
- Fixed a datetime bounding issue when datetime index is omitted.
  If the minimum datetime value of the incoming dataframe cannot be determined, do not bound the get_data() request.
- Keep existing parameters when registering plugin pipes.
  When a pipe is registered with a plugin as its connector, the return value of the register() function will be patched with the existing in-memory parameters.
- Fixed a data type syncing issue.
  In cases where fetched data types do not match the data types in the pipe's table (e.g. automatic datetime columns), a bug has been patched to ensure the correct data types are enforced.
- Added Venv to the root namespace.
  Now you can access virtual environments directly from mrsm.
v1.4.9¶
- Fixed in-place syncs for aggregate queries.
  In-place SQL syncs which use aggregation functions are now handled correctly. This version addresses differences in column types between backtrack and new data.
- Activate virtual environments for custom instance connectors.
  All pipe methods now activate virtual environments for custom instance connectors.
- Improved database connection performance.
  Cold connections to a SQL database have been sped up by replacing sqlalchemy_utils with handwritten logic (JSON for PostgreSQL-like and SQLite).
- Fixed an issue with virtual environment verification in a portable environment.
  The portable build has been updated to Python 3.9.15, and this patch includes a check to determine the known site-packages path for a virtual environment of None instead of relying on the default user site-packages directory.
- Fixed some environment warnings when starting the API.
v1.4.5 – v1.4.8¶
- Bugfixes and stability improvements.
  These versions included several bugfixes, such as patching --skip-check-existing for in-place syncs and fixing the behavior of --params (build_where()).
v1.4.0 – v1.4.4¶
- Added in-place syncing for SQL pipes.
  This feature is big (enough to warrant a new point release). When pipes with the same instance connector and data source connector are synced, the method sync_pipe_inplace() is invoked. For SQL pipes, this means the entire syncing process will now happen entirely in SQL, which can lead to massive performance improvements.
  This applies even when the source table's schema changes, just like the dynamic columns feature added in v1.3.0.
  To disable this behavior, run the command edit config system and set the value under the keys experimental:inplace_sync to false.
- Added negation to --params.
  The build_where() function now allows you to negate certain values when prefixed with an underscore (_).
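The underscore negation can be sketched as a tiny WHERE-clause builder. The function build_where_sketch is a simplified, hypothetical stand-in for build_where(): it handles only scalar values, does no escaping, and is meant purely to show how a leading underscore flips the comparison.

```python
from typing import Any, Dict


def build_where_sketch(params: Dict[str, Any]) -> str:
    """Build a naive WHERE clause, negating values that start with an underscore."""
    clauses = []
    for col, value in params.items():
        text = str(value)
        if text.startswith('_'):
            # Drop the underscore and compare with != instead of =.
            clauses.append(f"{col} != '{text[1:]}'")
        else:
            clauses.append(f"{col} = '{text}'")
    return 'WHERE ' + ' AND '.join(clauses) if clauses else ''


print(build_where_sketch({'color': '_red', 'size': 'large'}))
```

Real queries should rely on bound parameters or proper quoting rather than string formatting like this.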
-
Added
--paramsto SQL pipes' queries.
Specifying parameters when syncing SQL pipes will add those constraints to the fetch stage. -
Skip invalid parameters in
--params.
If a column does not exist in a pipe's table, the value will be ignored in--params. -
Fixed environment issue when starting the Web API with
gunicorn. - Added an emoji to the SQL Query option of the web console.
- Fixed an edge case with data type enforcement.
- Other bugfixes
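The `--params` negation noted above can be illustrated with a small stand-alone sketch. This is not Meerschaum's `build_where()`: the name `build_where_sketch` and its quoting rules are assumptions made for illustration, showing only how an underscore prefix could turn an equality test into an inequality.

```python
def build_where_sketch(params: dict) -> str:
    """Build a WHERE clause; values prefixed with '_' are negated (hypothetical helper)."""
    clauses = []
    for column, value in params.items():
        text = str(value)
        negate = text.startswith('_')
        if negate:
            text = text[1:]  # drop the negation prefix
        operator = '!=' if negate else '='
        escaped = text.replace("'", "''")  # naive escaping, for illustration only
        clauses.append(f"\"{column}\" {operator} '{escaped}'")
    return ('WHERE ' + ' AND '.join(clauses)) if clauses else ''


where = build_where_sketch({'status': '_inactive', 'region': 'us'})
print(where)
```

A real implementation would also need to handle lists of values, `NULL`, and flavor-specific quoting, which this sketch deliberately leaves out.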
1.3.x Releases
The 1.3.x series brought a tremendous number of new features and stability improvements. Read below to see everything that was introduced!
v1.3.13
- Fixed an issue when displaying backtrack data on the Web Console.
Certain values like `pd.NA` would break the Recent Data view on the Web Console. Now the values are cast to strings before building the table.
- Added YAML and JSON support to editing parameters.
YAML is now the default, and toggle buttons have been added to switch the encoding. Line numbers have also been added to the editors.
- Removed the index column from the CSV downloader.
When the download button is clicked, the dataframe's index column will be omitted from the CSV file.
- Changed the download filename to `<target>.csv`.
The download process will now name the CSV file after the table rather than the pipe.
- Web Console improvements.
The items in the actions builder are now presented with a monospace font. Actions and subactions will have underscores represented as spaces.
- Activating the virtual environment `None` will not override your current working directory.
This is especially useful when testing the API. Activating the virtual environment `None` will insert behind your current working directory or `''` in `sys.path`.
- Added WebSocket Secure Support.
This has been a long time coming, so I'm proud to announce that the web console can now detect whether the client is connecting via HTTPS and (assuming the server has the appropriate proxy configuration) will connect via WSS.
v1.3.10 – v1.3.12
- Fixed virtual environment issues when syncing.
This one's a doozy. Before this patch, there would frequently be warnings and sometimes exceptions thrown when syncing a lot of pipes with a lot of threads. This kind of race condition can be hard to pin down, so this patch reworks the virtual environment resolution system by keeping track of which threads have activated the environments and refusing to deactivate if other threads still depend on the environment. To enforce this behavior, most manual invocations of `activate_venv()` were replaced with the `Venv` context manager. Finally, the last stage of each action is to clean up any stray virtual environments. Note: You may still run into the wrong version of a package being imported into your plugin if you're syncing a lot of plugins concurrently.
- Allow custom instance connectors to be selected on the web console.
Provided all of the appropriate interface methods are implemented, selecting a custom instance connector from the instance dropdown should no longer throw an error.
- Bugfixes and improvements to the virtual environment system.
This patch should resolve your virtual environment woes but at a somewhat significant performance penalty. Oh well, 'tis the price we must pay for correct and deterministic code!
- Fixed custom flags added by `add_plugin_argument()`.
Refactoring work to decouple plugins from the argument parser had the unintended side effect of skipping over custom flags until after sysargs had already been parsed. This patch ensures all plugins with `add_plugin_argument` in their root module will be loaded before parsing.
- Upgraded `dash-extensions`, `dash`, and `dash-bootstrap-components`.
At long last, `dash-extensions` will no longer need to be held back.
- Added additional websockets endpoints.
The endpoints `/ws`, `/dash/ws/`, and `/dashws` resolve to the same handler. This is to allow compatibility with different versions of `dash-extensions`.
- Allow for custom arguments to be added from outside plugins.
The function `add_plugin_argument()` will now accept arguments when executed from outside a plugin.
- Fixed `verify packages`.
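The thread-aware activation described in the first entry above can be sketched with the standard library alone. The classes `VenvTracker` and `VenvSketch` below are hypothetical stand-ins, not Meerschaum's `Venv`; they only show the bookkeeping idea of recording which threads hold an environment and reporting a full deactivation once no thread depends on it.

```python
import threading
from collections import defaultdict


class VenvTracker:
    """Hypothetical registry of which threads have activated which environments."""

    def __init__(self):
        self._lock = threading.RLock()
        self._active = defaultdict(set)  # venv name -> identifiers of dependent threads

    def activate(self, venv: str) -> None:
        with self._lock:
            self._active[venv].add(threading.get_ident())

    def deactivate(self, venv: str) -> bool:
        """Release this thread's claim; return True only if no thread still needs the venv."""
        with self._lock:
            self._active[venv].discard(threading.get_ident())
            return not self._active[venv]


class VenvSketch:
    """Context manager that defers to the tracker (illustrative only)."""

    def __init__(self, tracker: VenvTracker, venv: str):
        self.tracker, self.venv = tracker, venv
        self.fully_deactivated = None

    def __enter__(self):
        self.tracker.activate(self.venv)
        return self

    def __exit__(self, exc_type, exc, tb):
        self.fully_deactivated = self.tracker.deactivate(self.venv)
        return False


tracker = VenvTracker()
results = {}

def worker():
    with VenvSketch(tracker, 'mrsm') as venv:
        pass
    # The main thread still holds 'mrsm', so the worker must not tear it down.
    results['worker'] = venv.fully_deactivated

with VenvSketch(tracker, 'mrsm') as outer:
    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
results['main'] = outer.fully_deactivated
print(results)
```

The sketch tracks one claim per thread; supporting nested activations within a single thread would require per-thread counters instead of a set.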
v1.3.6 – v1.3.9
- Allow for syncing multiple data types per column.
The highlight of this release is support for syncing multiple data types per column. When different data types are encountered, the underlying column will be converted to `TEXT`.
- Cleaned up the Web Console.
The web console's navbar is now more mobile-friendly, and a "sign out" button has been added.
- Removed plugins import when checking for environment connectors.
This should make some commands feel more snappy by lazy loading custom connectors.
- Fixed an issue with an updated version of `uvicorn`.
- Fixed an issue with `docker-compose`.
- Fixed an issue with `FastAPI` on Python 3.7.
- Added support for Python 3.11.
- Renamed `meerschaum.actions.arguments` to `meerschaum._internal.arguments`.
v1.3.4 – v1.3.5
- Define environment connectors with JSON or URIs.
Connectors defined as environment variables may now have their attributes set as JSON in addition to a URI string.
- Custom Connectors may now be defined as environment variables.
You may now set environment variables for custom connectors defined via `@make_connector`.
- Allow for custom connectors to be instance connectors.
Add the property `IS_INSTANCE = True` to your custom connector to add it to the official list of instance types.
- Install packages for all plugins with `mrsm install required`.
The default behavior for `mrsm install required` with no plugins named is now to install dependencies for all plugins.
- Syncing bugfixes.
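To make the "JSON or URI" entry above concrete, here is a minimal sketch of how a single environment variable value might be interpreted either way. The helper `parse_connector_value` and the attribute names it returns (`flavor`, `username`, `host`, and so on) are assumptions for illustration, not Meerschaum's actual parser.

```python
import json
from urllib.parse import urlsplit


def parse_connector_value(value: str) -> dict:
    """Return connector attributes from a JSON object or a URI string (hypothetical helper)."""
    stripped = value.strip()
    if stripped.startswith('{'):
        # JSON form: the attributes are given explicitly.
        return json.loads(stripped)
    # URI form: derive the attributes from the URI's components.
    parts = urlsplit(stripped)
    return {
        'flavor': parts.scheme,
        'username': parts.username,
        'password': parts.password,
        'host': parts.hostname,
        'port': parts.port,
        'database': parts.path.lstrip('/') or None,
    }


from_uri = parse_connector_value('postgresql://user:pass@localhost:5432/db')
from_json = parse_connector_value('{"flavor": "sqlite", "database": "/tmp/demo.db"}')
print(from_uri)
print(from_json)
```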
v1.3.2 – v1.3.3
- Fixed a bug with `begin` and `end` bounds in `Pipe.get_data()`.
A safety measure was incorrectly checking if the quoted version of a column was in `pipe.get_columns_types()`, not the unquoted version. This patch restores functionality for `pipe.get_data()`.
- Fixed an issue with an upgraded version of `SQLAlchemy`.
- Added a parameters editor to the Web UI.
You may now edit your pipes' parameters in the browser through the Web UI!
- Added a SQL query editor to the Web UI.
Like the parameters editor, you can edit your pipes' SQL queries in the browser.
- Added a Sync Documents option to the Web UI.
You can directly sync documents into pipes on the Web UI.
- Added the arguments `order`, `limit`, `begin_add_minutes`, and `end_add_minutes` to `Pipe.get_data()`.
These new arguments will give you finer control over the data selection behavior.
- Enforce consistent ordering of indices in `Pipe.get_data()`.
- Allow syncing JSON-encoded strings.
This patch allows pipes to sync JSON strings without first needing them to be deserialized.
- Fixed an environment error with Ubuntu 18.04.
- Bumped `duckdb` and `duckdb-engine`.
- Added a basic CLI for `duckdb`.
This will probably be replaced later down the line.
v1.3.1
- Fixed data type enforcement issues.
A serious bug in data type enforcement has been patched.
- Allow `Pipe.dtypes` to be edited.
You can now set keys in `Pipe.dtypes` and persist them with `Pipe.edit()`.
- Added `Pipe.update()`.
`Pipe.update()` is an alias to `Pipe.edit(interactive=False)`.
- `Pipe.delete()` no longer deletes local attributes.
It still removes `Pipe.id`, but local attributes will now remain intact.
- Fixed dynamic columns on DuckDB.
DuckDB does not allow for altering tables when indices are created, so this patch will drop and rebuild indices when tables are altered.
- Replaced `CLOB` with `NVARCHAR(2000)` on Oracle SQL.
This may require migrating existing pipes to use the new data type.
- Enforce integers are of type `INTEGER` on Oracle SQL.
Lots of data type enforcement has been added for Oracle SQL.
- Removed datetime warnings when syncing pipes without a datetime column.
- Removed grabbing the current time for the sync time if a sync time cannot be determined.
v1.3.0: Dynamic Columns
Improvements
- Syncing now handles dynamic columns.
Syncing a pipe with new columns will trigger an `ALTER TABLE` query to append the columns to your table:

```python
import meerschaum as mrsm
pipe = mrsm.Pipe('foo', 'bar', instance='sql:memory')

pipe.sync([{'a': 1}])
print(pipe.get_data())
#    a
# 0  1

pipe.sync([{'b': 1}])
print(pipe.get_data())
#       a     b
# 0     1  <NA>
# 1  <NA>     1
```

If you've specified index columns, you can use this feature to fill in `NULL` values in your table:

```python
import meerschaum as mrsm
pipe = mrsm.Pipe(
    'foo', 'bar',
    columns = {'id': 'id_col'},
    instance = 'sql:memory',
)

pipe.sync([{'id_col': 1, 'a': 10.0}])
pipe.sync([{'id_col': 1, 'b': 20.0}])
print(pipe.get_data())
#    id_col     a     b
# 0       1  10.0  20.0
```

- Add as many indices as you like.
In addition to the special index column labels `datetime`, `id`, and `value`, the values of all keys within the `Pipe.columns` dictionary will be treated as indices when creating and updating tables:

```python
import meerschaum as mrsm
indices = {'micro': 'station', 'macro': 'country'}
pipe = mrsm.Pipe('demo', 'weather', columns=indices, instance='sql:memory')

docs = [{'station': 1, 'country': 'USA', 'temp_f': 80.6}]
pipe.sync(docs)

docs = [{'station': 1, 'country': 'USA', 'temp_c': 27.0}]
pipe.sync(docs)

print(pipe.get_data())
#    station country  temp_f  temp_c
# 0        1     USA    80.6    27.0
```

- Added a default 60-second timeout for pipe attributes.
All parameter properties (e.g. `Pipe.columns`, `Pipe.target`, `Pipe.dtypes`, etc.) will sync with the instance every 60 seconds. The in-memory attributes will be patched on top of the database values, so your unsaved state won't be lost (persist your state with `Pipe.edit()`). You can change the timeout duration with `mrsm edit config pipes` under the keys `attributes:local_cache_timeout_seconds`. To disable this caching behavior, set the value to `null`.
- Added custom indices and Pandas data types to the Web UI.
Breaking Changes
- Removed `None` as default for uninitialized properties for pipes.
Parameter properties like `Pipe.columns`, `Pipe.parameters`, etc. will now always return a dictionary, even if a pipe is not registered.
- `Pipe.get_columns()` now sets `error` to `False` by default.
Pipes are now mostly index-agnostic, so these checks are no longer needed. This downgrades errors in several functions to just warnings, e.g. `Pipe.get_sync_time()`.
Bugfixes
- Always quote tables that begin with underscores in Oracle.
- Always refresh the metadata when grabbing `sqlalchemy` tables for pipes to account for dynamic state.
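The 60-second attribute timeout described above amounts to a time-based cache with local changes patched on top of the persisted values. The following is a rough sketch of that idea under stated assumptions: `CachedAttributes`, its `fetch` callable, and the injectable `clock` are hypothetical names, and the patching here is shallow rather than recursive.

```python
import time


class CachedAttributes:
    """Hypothetical sketch: re-fetch persisted attributes after a timeout, patching local edits on top."""

    def __init__(self, fetch, timeout_seconds=60, clock=time.monotonic):
        self._fetch = fetch              # callable returning the persisted dictionary
        self._timeout = timeout_seconds  # None disables caching (always fetch)
        self._clock = clock
        self._local = {}                 # unsaved in-memory changes
        self._remote = {}
        self._fetched_at = None

    def set(self, key, value) -> None:
        self._local[key] = value

    def get(self) -> dict:
        now = self._clock()
        expired = (
            self._fetched_at is None
            or self._timeout is None
            or now - self._fetched_at >= self._timeout
        )
        if expired:
            self._remote = dict(self._fetch())
            self._fetched_at = now
        # Local values win, so unsaved state is never lost on refresh.
        return {**self._remote, **self._local}


database = {'columns': {'datetime': 'ts'}}
fake_now = [0.0]
attrs = CachedAttributes(lambda: database, timeout_seconds=60, clock=lambda: fake_now[0])

attrs.set('target', 'my_table')   # unsaved local change
first = attrs.get()
database['tags'] = ['prod']       # the persisted values change elsewhere
stale = attrs.get()               # within the timeout: cached values are reused
fake_now[0] = 61.0
fresh = attrs.get()               # past the timeout: values are fetched again
print(stale, fresh)
```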
1.2.x Releases
This series brought many industry-ready features, such as the `@make_connector` decorator, improvements to the virtual environment system, the environment variable `MRSM_PLUGINS_DIR`, and much more.
v1.2.9
- Added support for Windows junctions for virtual environments.
This included many changes to fix functionality on Windows. For example, the addition of the `MRSM_PLUGINS_DIR` environment variable broke Meerschaum on Windows, because Windows requires administrator rights to create symlinks.
v1.2.8
- Custom connectors may now have `register(pipe)` methods.
Just like the module-level `register(pipe)` plugin function, custom connectors may also provide this function as a class member.
- Print a traceback if `fetch(pipe)` breaks.
A more verbose traceback is printed if a plugin breaks during the syncing process.
- Cleaned up `sync pipes` output.
This patch cleans up the syncing process's pretty output.
- Respect `--nopretty` in `sync pipes`.
This flag will only print JSON-encoded dictionaries for `sync pipes`. Tracebacks may still interfere with standard output, however.
v1.2.5 – v1.2.7
- `Venv` context managers do not deactivate previously activated venvs.
You can safely use `Venv` without worrying about deactivating your previously activated environments.
- Better handling of nested plugin dependencies.
- `Plugin.get_dependencies()` will not trigger an import.
If you want certainty about a plugin's required list, trigger an import manually. Otherwise, it will use `ast.literal_eval()` to determine the required list from the source itself. This only works for statically set `required` lists.
- Provide rich traceback for broken plugins.
If a plugin fails to import, a nice traceback is printed out in addition to a warning.
- Only cache `Pipe.dtypes` if the pipe exists.
- Pass current environment to subprocesses.
This should retain any custom configuration you've set in the main process.
- Hard-code port 5432 as the target DB container port in the stack.
Changing the host port now will not change the target port in the container.
- Fixed a bug with background jobs and `to_sql()`.
The `--name` flag was conflicting with `to_sql()`.
- Reimplemented `apply_patch_to_config()`.
This patch removes `cascadict` as a vendored dependency and replaces it with a simpler implementation.
- Removed network request for shell connectivity status.
The shell now simply checks for the existence of the connector. This may occasionally print an inaccurate connection status, but the speed benefit is worth it.
- Moved `dill` and other required dependencies into the `sql` dependency group.
- Replaced `redengine` with `rocketry`.
- Patched `Literal` into `typing` for Python 3.7.
- Fixed shell commands.
This includes falling back to `' '.join` instead of `shlex.join` for Python 3.7.
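The static lookup mentioned for `Plugin.get_dependencies()` can be approximated as follows. This is a sketch of the general technique (parse the source, find a module-level `required` assignment, and evaluate it with `ast.literal_eval()`), not the library's implementation; the helper name `get_required_statically` is an assumption.

```python
import ast


def get_required_statically(source: str) -> list:
    """Read a module-level `required = [...]` list without importing the module (hypothetical helper)."""
    tree = ast.parse(source)
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        names = [target.id for target in node.targets if isinstance(target, ast.Name)]
        if 'required' not in names:
            continue
        try:
            return list(ast.literal_eval(node.value))
        except (ValueError, TypeError):
            # Dynamically built lists (e.g. `required = base + extras`) cannot be
            # evaluated statically, which is why only static lists are supported.
            return []
    return []


plugin_source = (
    "required = ['pandas>=1.3', 'requests']\n"
    "\n"
    "def fetch(pipe, **kw):\n"
    "    return []\n"
)
static_required = get_required_statically(plugin_source)
dynamic_required = get_required_statically("base = ['a']\nrequired = base + ['b']\n")
print(static_required, dynamic_required)
```

Because nothing is imported, the plugin's top-level code never runs, which is the point of avoiding the import in the first place.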
v1.2.1 – v1.2.4
- Added the `@make_connector` decorator.
Plugins may now extend the base `Connector` class to provide custom connectors. For most cases, the built-in `plugin` connector should work fine. This addition opens up the internal connector system so that plugin authors may now add new types.
- Allow for omitting `datetime` column index.
The `datetime` column name is still highly recommended, but recent changes have allowed pipes to be synced without a dedicated datetime axis. Plenty of warnings will be thrown if you sync your pipes without specifying a datetime index. If a datetime column can be found, it will be used as the index.
v1.2.0
Improvements
- Added the action `start connectors`.
This command allows you to wait until all of the specified connectors are available and accepting connections. This feature is very handy when paired with the new `MRSM_SQL_X` URI environment variables.
- Added `MRSM_PLUGINS_DIR`.
This one's been on my to-do list for quite a bit! You can now place your plugins in a dedicated, version-controlled directory outside of your root directory.
Like `MRSM_ROOT_DIR`, specify a path with the environment variable `MRSM_PLUGINS_DIR`.
- Allow for symlinking in URI environment variables.
You may now reference configuration keys within URI variables.
- Increased token expiration window to 12 hours.
This should reduce the number of login requests needed.
- Improved virtual environment verification.
More edge cases have been addressed.
- Symlink Meerschaum into `Plugin` virtual environments.
If plugins do not specify `Meerschaum` in the `required` list, Meerschaum will be symlinked to the currently running package.
Breaking changes
- API endpoints for registering and editing users changed.
To comply with OAuth2 convention, the API endpoint for registering a user is now a url-encoded form submission to `/users/register` (`/user/edit` for editing).
You must upgrade both the server and client to v1.2.0+ to log in to your API instances.
- Replaced `meerschaum.utils.sql.update_query()` with `meerschaum.utils.sql.get_update_queries()`.
The new function returns a list of query strings rather than a single query. These queries are executed within a single transaction.
Bugfixes
- Removed version enforcement in `pip_install()`.
This changed behavior allows for custom version constraints to be specified in Meerschaum plugins.
- Backported `UPDATE FROM` query for older versions of SQLite.
The current mutable data logic uses an `UPDATE FROM` query, but this syntax is only present in SQLite 3.33.0 (released 2020-08-14) and later. This release splits the same logic into `DELETE` and `INSERT` queries for older versions of SQLite.
- Fixed missing suggestions for shell-only commands.
Completions for commands like `instance` are now suggested.
- Fixed an issue with killing background jobs.
The signals were not being sent correctly, so this release includes better job process management.
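The SQLite backport above can be sketched with the standard `sqlite3` module: return one `UPDATE ... FROM` query when the library is new enough, otherwise a `DELETE` followed by an `INSERT`, and run the list inside one transaction. The helper `get_update_queries_sketch`, its parameters, and the table names are assumptions for illustration and do not mirror the signature of `get_update_queries()`. The fallback also assumes the patch table only contains rows that already exist in the target.

```python
import sqlite3


def get_update_queries_sketch(target: str, patch: str, key: str, value: str, version=None) -> list:
    """Return queries that apply the rows in `patch` to `target` (hypothetical helper)."""
    version = version or sqlite3.sqlite_version_info
    if version >= (3, 33, 0):
        # Newer SQLite: a single UPDATE ... FROM query suffices.
        return [
            f"UPDATE {target} SET {value} = p.{value} "
            f"FROM {patch} AS p WHERE {target}.{key} = p.{key}"
        ]
    # Older SQLite: remove the stale rows, then re-insert them from the patch table.
    return [
        f"DELETE FROM {target} WHERE {key} IN (SELECT {key} FROM {patch})",
        f"INSERT INTO {target} ({key}, {value}) SELECT {key}, {value} FROM {patch}",
    ]


conn = sqlite3.connect(':memory:')
conn.executescript("""
    CREATE TABLE pipe (id INTEGER, value INTEGER);
    CREATE TABLE _pipe (id INTEGER, value INTEGER);
    INSERT INTO pipe VALUES (1, 10), (2, 20);
    INSERT INTO _pipe VALUES (1, 100);
""")

# Force the fallback path so the example behaves the same on any SQLite version.
queries = get_update_queries_sketch('pipe', '_pipe', 'id', 'value', version=(3, 32, 0))
with conn:  # commit both queries together, or roll back on error
    for query in queries:
        conn.execute(query)

rows = conn.execute("SELECT id, value FROM pipe ORDER BY id").fetchall()
print(rows)
```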
1.1.x Releases
The 1.1.x series brought a lot of great new features, notably connector URI parsing (e.g. `MRSM_SQL_<LABEL>`), parsing underscores as spaces in actions, and rewriting the Docker image to run as a normal user.
v1.1.9 – v1.1.10
- Fixed plugins virtual environments.
A typo in v1.1.8 temporarily broke plugins, and this patch fixes that change.
- Fixed Meerschaum on Windows.
A change in a previous release allowed for dist-packages for the root user (not advised but supported). The check for root (`os.geteuid()`) does not exist on Windows, so this patch accounts for that behavior.
- Tweaked screen clearing on Windows.
Meerschaum now always uses `clear` or `cls` on Windows instead of ANSI escape sequences.
v1.1.5 – v1.1.8
- Fixed `MRSM_PATCH` behavior.
In the Docker image, `MRSM_PATCH` is used to overwrite `host` for `sql:main`. This patch restores that behavior (with a performance boost).
- Fixed virtual environment verification.
This patch prevents circular symlinks.
- Fixed `manually_import_module()`.
Previous refactoring efforts had broken `manually_import_module()`.
- Refactoring
While trying to implement multi-thread configuration patching (discarded for the time being), much of the configuration system was cleaned up.
v1.1.1 – v1.1.4
Bugfixes
The first four versions following the initial v1.1.0 release addressed breaking changes and edge cases. Below are some notable issues resolved:
- Fixed broken Docker images.
Changes to the environment and package systems broke functionality of the Docker images. For example, v1.1.0 switched to a stricter package management policy, but this new policy broke the mechanism behind the Docker images (user-level vs venv-level packages).
- Verify virtual environments for multiple Python versions.
When a virtual environment is first activated, Meerschaum now verifies that the `python` symlinks point to the correct versions. This is necessary due to a quirk in `venv` and `virtualenv` where activating an existing environment with a different Python version overwrites the existing `python` symlink. It also ensures that the symlinks specify the correct version number, e.g. `python3.10`. This behavior is now automatic but may be invoked with `mrsm verify venvs`.
- Fixed inconsistent environment behavior with `gunicorn`.
This one was tricky to troubleshoot. Due to the migration to the user-level Docker image, a subtle bug surfaced where the environment variables for `gunicorn` were incorrectly serialized.
- Fixed slow-performing edge cases in `determine_version()`.
Inconsistencies in naming conventions in some packages like `pygments` led to failures to quickly determine the version.
- Fixed Web API actions.
In v1.1.0, the default virtual environment was pinned to `mrsm`, and this broke a function which relied on the old inferred default value of `None`. Always remember: explicit is better than implicit.
- Fixed `start job` for existing jobs.
The same naming change broke `daemon_action()`. Explicit code is important, folks!
v1.1.0
What's New
- Underscores in actions may now be parsed as spaces.
This took way more work than expected, but anyway, underscores in the function names of custom actions are now treated as spaces!
For example, a custom action whose function is named `foo_bar` may now be executed as `foo bar` or `foo_bar`.
- Create a `SQLConnector` or `APIConnector` directly from a URI.
If you already have a connection string, you can skip providing credentials and build a connector directly from the URI. If you omit a `label`, then the lowercase form of `'<username>@<host>/<database>'` is used.
The `APIConnector` may also be built from a URI.
- Define temporary connectors from URIs in environment variables.
If you set environment variables with the format `MRSM_SQL_<LABEL>` to valid URIs, new connectors will be available under the keys `sql:<label>`, where `<label>` is the lowercase form of `<LABEL>`.
You can set as many connectors as you like, and they're treated the same as connectors registered in your permanent configuration.
Bugfixes
- Resolved issues with conflicting virtual and base environments.
- Only reinstall a package available at the user-level if its version doesn't match.
This was a subtle bug, but now packages are handled strictly in virtual environments except when an appropriate version is available. This may slow down performance, but the change is necessary to ensure a consistent environment.
Potentially Breaking Changes
- The database file path for SQLite and DuckDB is now required.
When creating a `SQLConnector` with the flavors `sqlite` or `duckdb`, the attribute `database` (a file path or `:memory:`) is now required.
- Removed `--config` and `--root-dir`.
These flags were added very early on but have always caused issues. Instead, please use the environment variables `MRSM_CONFIG` or `MRSM_PATCH` for modifying the runtime configuration, and use `MRSM_ROOT_DIR` to specify a file path to the root Meerschaum directory.
- Virtual environments must be `None` for standard library packages.
When importing a built-in module with `attempt_import()`, specify `venv=None` to avoid attempting installation.
- The Docker image now runs as `meerschaum` instead of `root`.
For improved security, the Docker image now runs at a lower privilege.
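One way to picture the underscore handling from the first "What's New" entry is a greedy match of leading command-line tokens against registered action names. This is purely an illustrative sketch: `resolve_action` and the `registered` set are hypothetical, and Meerschaum's real parser may work quite differently.

```python
def resolve_action(tokens: list, registered: set) -> tuple:
    """Match the longest run of leading tokens to a registered action name (hypothetical helper)."""
    for end in range(len(tokens), 0, -1):
        candidate = '_'.join(tokens[:end])  # 'foo bar' -> 'foo_bar'
        if candidate in registered:
            return candidate, tokens[end:]
    raise KeyError(f"No action found for: {' '.join(tokens)}")


registered = {'foo_bar', 'sync'}
spaced = resolve_action(['foo', 'bar', '--debug'], registered)
joined = resolve_action(['foo_bar'], registered)
print(spaced, joined)
```

Trying the longest candidate first lets `foo bar` win over a shorter action named `foo`, should both be registered.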
1.0.x Releases
The v1.0.0 release was big news. A ton of features, bugfixes, and performance improvements were introduced: for example, v1.0.0 brought support for mutable pipes and data type enforcement. Later releases in the v1.0.x series included `--schedule`, the `Venv` context manager, and a whole lot of environment bugfixes.
v1.0.6
- Plugins may now have `requirements.txt`.
If a plugin contains a file named `requirements.txt`, the file will be parsed alongside the packages specified in the `required` list.
- Added the module `meerschaum.utils.venv`.
Functions related to virtual environment management have been migrated from `meerschaum.utils.packages` to `meerschaum.utils.venv`.
- Added the `Venv` class.
You can now manage your virtual environments with the `Venv` context manager.
You can also activate the environments for a `Plugin`.
- Removed `--isolated` from `pip_install`.
Virtual environments will now respect environment variables and your global `pip` configuration (`~/.pip/pip.conf`).
- Fixed issues for Python 3.7
v1.0.3 – v1.0.5
- Fixed environment bugs.
This patch resolves issues with the environment variables `MRSM_ROOT_DIR`, `MRSM_CONFIG`, and `MRSM_PATCH` as well as the configuration directories `patch_config` and `permanent_patch_config`.
- Fixed package management system.
Meerschaum better handles package metadata, resolving some annoying issues. See `meerschaum.utils.packages.get_module_path()` for an example of the improved virtual environment management system. Also, `wheel` is automatically installed when new packages are installed into new virtual environments.
- Set the default venv to `'mrsm'`.
In all functions declared in `meerschaum.utils.packages`, the default value of `venv` is always `'mrsm'`. Use `None` for the `venv` to use the user's site packages.
- Updated dependencies.
- Added `python-dotenv` as a dependency.
- Fixed a catalog issue with `duckdb`.
- Updated the testing suite.
- More refactoring.
Code needs to be beautiful!
v1.0.2
- Allow `id` column to be omitted.
When generating the `UPDATE` query, the `id` column may now be omitted (NOTE: the datetime column will be assumed to be the primary key in this scenario).
- Added `--schedule` (`-s` or `--cron`).
The `--schedule` flag (`-s`) now lets you schedule any command to be executed regularly, not unlike crontab. This can come in handy with `--daemon` (`-d`).
Here is more information on the scheduling syntax.
- Fixed an issue with SQLite.
An issue where the value columns were not loading in SQLite has been addressed.
v1.0.1
- Added `citus` as an official database flavor.
Citus is a distributed database built on PostgreSQL. When an `id` column is provided, Meerschaum will call `create_distributed_table()` on the pipe's ID index. Citus has also been added to the official test suite.
- Changed `end` datetimes to be exclusive.
The `end` parameter now generates `<` instead of `<=`. This shouldn't be a major breaking change but is important to be aware of.
- Bumped `rich` to v12.4.4.
v1.0.0: Mutable at Last
What's New
- Inserts and Updates
An additional layer of processing separates new rows from updated rows. Meerschaum uses your `datetime` and `id` columns (if you specified an `id` column) to determine which rows have changed. Therefore a primary key is not required, as long as the `datetime` column is unique or the `datetime` and `id` columns together emulate a composite primary key.
Meerschaum will insert new rows as before as well as creating a temporary table (same name as the pipe's target but with a leading underscore). The syncing engine then issues the appropriate `MERGE` or `UPDATE` query to update all of the rows in a batch.
For example, the following lines of code will result in a table with only 1 row:

```python
>>> import meerschaum as mrsm
>>> pipe = mrsm.Pipe('foo', 'bar', columns={'datetime': 'dt', 'id': 'id'})
>>>
>>> ### Insert the first row.
>>> pipe.sync([{'dt': '2022-06-26', 'id': 1, 'value': 10}])
>>>
>>> ### Duplicate row, no change.
>>> pipe.sync([{'dt': '2022-06-26', 'id': 1, 'value': 10}])
>>>
>>> ### Update the value columns of the first row.
>>> pipe.sync([{'dt': '2022-06-26', 'id': 1, 'value': 100}])
```

- Data Type Enforcement.
Incoming DataFrames will be cast to the pipe's existing data types, and if you want total control, you can manually specify the Pandas data types for as many columns as you like under the `dtypes` key of `Pipe.parameters`, e.g.:

```yaml
columns:
  datetime: timestamp_utc
  id: station_id
dtypes:
  timestamp_utc: 'datetime64[ns]'
  station_id: 'Int64'
  station_name: 'object'
```

- Allow for `NULL` in `INT` columns.
Before pandas v1.3.0, including a null value in an int column would cast it to a float. Now `pd.NA` has been added and is leveraged in Meerschaum's data type inference system.
- Plugins respect `.gitignore`
When publishing a plugin that is contained in a git repository, Meerschaum will parse your `.gitignore` file to determine which files to omit.
- Private API Mode.
Require authentication on all API endpoints with `start api --private`.
- No Authentication API Mode.
Adding `--no-auth` will disable all authentication on all API endpoints, including the web console.
Bugfixes
- Plugin packaging
A breaking bug in the process of packaging and publishing Meerschaum plugins has been patched.
- Correct object names in Oracle SQL.
Oracle has finally been brought up to speed with other flavors.
- Multi-module plugins fix.
A small but important fix for multi-module plugins has been applied for their virtual environments.
- Improved virtual environment handling for Debian systems.
If `venv` is not available, Meerschaum now better handles falling back to `virtualenv`.
- Allow for syncing lists of dicts.
In addition to syncing a dict of lists, `Pipe.sync()` now supports a list of dicts.
- Allow for `begin` to equal `None` for `Pipe.fetch()`.
The behavior of determining `begin` from `Pipe.get_sync_time()` only takes place when `begin` is omitted, not when it is `None`. Now `begin=None` will not add a lower bound to the query.
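The separation of inserts from updates described under "Inserts and Updates" above can be modeled in plain Python, treating the `datetime` and `id` columns as a composite key. This is a simplified in-memory sketch, not the SQL-based syncing engine; `split_inserts_and_updates` and `sync_rows` are hypothetical names.

```python
def split_inserts_and_updates(existing: list, incoming: list, keys=('dt', 'id')) -> tuple:
    """Classify incoming rows as new or changed by their index columns (hypothetical helper)."""
    by_key = {tuple(row[k] for k in keys): row for row in existing}
    inserts, updates = [], []
    for row in incoming:
        match = by_key.get(tuple(row[k] for k in keys))
        if match is None:
            inserts.append(row)   # unseen key: a new row
        elif match != row:
            updates.append(row)   # same key, different values: an update
        # identical rows are duplicates and are skipped
    return inserts, updates


def sync_rows(table: list, docs: list, keys=('dt', 'id')) -> None:
    """Apply the updates in place, then append the new rows."""
    inserts, updates = split_inserts_and_updates(table, docs, keys)
    for row in updates:
        for i, old in enumerate(table):
            if all(old[k] == row[k] for k in keys):
                table[i] = {**old, **row}
    table.extend(inserts)


table = []
sync_rows(table, [{'dt': '2022-06-26', 'id': 1, 'value': 10}])   # insert
sync_rows(table, [{'dt': '2022-06-26', 'id': 1, 'value': 10}])   # duplicate, no change
sync_rows(table, [{'dt': '2022-06-26', 'id': 1, 'value': 100}])  # update
print(table)
```

As in the changelog's own example, three syncs leave a single row whose value column holds the latest value.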
0.6.x Releases
The 0.6.x series brought a lot of polish to the package, namely through refactoring and changing some legacy features to meet expected behaviors.
v0.6.3 – v0.6.4: Durable Venvs
- Improved durability of the virtual environments.
The function `meerschaum.utils.packages.manually_import_module()` behaves as expected, allowing you to import different versions of modules. More work needs to be done to see if reintroducing import hooks would be beneficial.
- Activate plugin virtual environments for API plugins.
If a plugin uses the `@api_plugin` decorator, its virtual environment will be activated before starting the API server. This could potentially cause problems if you have many API plugins with conflicting dependencies, but this could be mitigated by isolating environments with `MRSM_ROOT_DIR`.
- Changed `name` to `import_name` for `determine_version()`.
The first argument in `meerschaum.utils.packages.determine_version()` has been renamed from `name` to the less-ambiguous `import_name`.
- Shortened the IDs for API environments.
Rather than a long UUID, each instance of the API server will have a randomly generated ID of six letters. Keep in mind it is randomly generated, so please excuse any randomly generated words.
- Removed version enforcement for uvicorn and gunicorn.
Uvicorn has a lot of hidden imports, and using our home-brewed import system breaks things. Instead, we now use the default `attempt_import` behavior of `check_update=False`.
- Reintroduced `verify packages` to `setup.py`.
Upgrading Meerschaum will check if the virtual environment packages satisfy their required versions.
- Moved `pkg_resources` patch from the module-level.
In v0.6.3, the monkey-patching for `flask-compress` happened at the module level, but this was quickly moved to a lazy patch in v0.6.4.
- Bugfixes
Versions 0.6.3 and 0.6.4 were yanked due to some unforeseen broken features.
- Bumped several dependencies.
v0.6.0 – v0.6.2: Robust Plugins and Beautiful Pipes
Potentially Breaking Changes
- Renamed `meerschaum.connectors.sql.tools` to `meerschaum.utils.sql`.
A dummy module was created at the old import path, but this will be removed in future releases.
- Migrated to `meerschaum.core`.
Important class definitions (e.g. `User`) have been migrated from `meerschaum._internal` to `meerschaum.core`. You can still import `meerschaum.core.Pipe` as `mrsm.Pipe`, however.
- Moved `meerschaum.actions.shell` to `meerschaum._internal.shell`.
Finally marked it off the to-do list!
- `Pipe.__str__()` and `Pipe.__repr__()` now return stylized strings.
This should make reading logs significantly more pleasant. You can add syntax highlighting back to strings containing `Pipe()` with `meerschaum.utils.formatting.highlight_pipes()`.
New Features
- Plugins
Exposed the `meerschaum.Plugin` class, which will make cross-pollinating between plugins simpler.
- Uninstall procedure
`Plugins` now ship with a proper `uninstall()` method.
- Sharing dependencies
Plugins may now import dependencies from a required plugin's virtual environment. E.g. if plugin `foo` requires `plugin:bar`, and `bar` requires `pandas`, then `foo` will be able to import `pandas`.
- Allow other repos for required plugins.
You can now specify the keys of a required plugin's repo following `@`, e.g. `foo` may require `plugin:bar@api:main`.
- Isolate package cache.
Each virtual environment now uses an isolated cache folder.
- Handle multiple versions of packages in `determine_version()`
When verifying packages, if multiple `dist-info` directories are found in a virtual environment, import the package in a subprocess to determine its `__version__`.
- Specify a target table.
`Pipe.target` (`Pipe.parameters['target']`) now governs the name of the underlying SQL table.
Bugfixes
- Circular dependency resolver
Multiple plugins may now depend on each other without entering a recursive loop.
- Held back `dash_extensions` due to breaking API changes.
Future releases will migrate to `dash_extensions>1.0.0`.
- Fixed `meerschaum.plugins.add_plugin_argument()`.
Refactoring broke something awhile back; more plugins-focused tests are needed.
- Fixed an issue with `fontawesome` and `mkdocs-material`.
- Fixed pickling issue with `mrsm.Pipe`.
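The circular dependency fix and the `plugin:<name>@<repo keys>` syntax noted above can be illustrated together with a depth-first traversal that remembers which plugins it has already visited. The function `resolve_dependencies` and the shape of the `requirements` mapping are assumptions for this sketch rather than Meerschaum's internals.

```python
def resolve_dependencies(plugin: str, requirements: dict) -> list:
    """Return plugins in install order (dependencies first), tolerating cycles (hypothetical helper)."""
    ordered, seen = [], set()

    def visit(name: str) -> None:
        if name in seen:
            return  # already handled or currently being visited: break the cycle
        seen.add(name)
        for requirement in requirements.get(name, []):
            if requirement.startswith('plugin:'):
                # Strip the prefix and any '@<repo keys>' suffix, e.g. 'plugin:bar@api:main'.
                visit(requirement[len('plugin:'):].split('@', 1)[0])
        ordered.append(name)

    visit(plugin)
    return ordered


requirements = {
    'foo': ['plugin:bar@api:main', 'requests'],
    'bar': ['plugin:foo', 'pandas'],  # depends back on 'foo'
}
order = resolve_dependencies('foo', requirements)
print(order)
```

Marking a plugin as seen before descending into its requirements is what prevents the recursive loop when two plugins require each other.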
Documentation
- `pdoc` changes.
Added `__pdoc__` and `__all__` to public modules to simplify the package docs.
- Lots of cleanup.
Almost all of the docstrings have been edited.
0.5.x Releases
The 0.5.x series tied up many loose ends and brought in new features, such as fully integrating Oracle SQL, rewriting most of the docstrings, and adding tags and key negation. It also added the `clear pipes` command and introduced the GUI and webterm.
v0.5.14 – v0.5.15
- Added tags.
Pipes may now be grouped together with tags. Check the docs for more information.
- Tags may be negated.
Like the key negation added in v0.5.13, you can choose to ignore tags by prefacing them with `_`.
- Bugfixes for DuckDB
- Updated documentation.
- Fixed issue with `flask-compress`.
When starting the API for the first time, missing `flask-compress` will not crash the server.
v0.5.13¶
- Key negation when selecting pipes.
Prefix connector, metric, or location with `_` to select pipes that do NOT have that key.
- Added the `setup plugins` command.
Run the command `setup plugins` followed by a list of plugins to execute their `setup()` functions.
- Renamed pipes' keys methods function.
The function `meerschaum.utils.get_pipes.methods()` is renamed to `meerschaum.utils.get_pipes.fetch_pipes_keys()`.
- Improved stability for DuckDB.
- Bumped dependencies.
DuckDB, FastAPI, and Uvicorn have been updated to their latest stable versions.
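
The key negation described above can be sketched as a simple filter. The helper `filter_keys()` below is hypothetical and only illustrates the `_` prefix convention; Meerschaum's own selection logic may differ.

```python
from typing import List

def filter_keys(available: List[str], requested: List[str]) -> List[str]:
    """
    Select keys from `available`, treating requested keys prefixed with `_` as negations.
    This is a hypothetical helper, not Meerschaum's actual implementation.
    """
    negated = {key[1:] for key in requested if key.startswith('_')}
    wanted = {key for key in requested if not key.startswith('_')}
    return [
        key for key in available
        # Drop negated keys, and keep everything else unless specific keys were requested.
        if key not in negated and (not wanted or key in wanted)
    ]

connector_keys = ['sql:main', 'sql:local', 'api:mrsm']
selected = filter_keys(connector_keys, ['_sql:local'])
```

In this sketch, requesting `_sql:local` keeps every connector except `sql:local`.
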
v0.5.11 – v0.5.12¶
- Improved Oracle support.
Oracle SQL has been fully integrated into the testing suite, and critical bugs have been addressed.
- Added the `install required` command.
When developing plugins, run the command `install required` to install the packages in the plugin's `required` list into its virtual environment.
- Migrated docstrings.
To improve legibility, many docstrings have been rewritten from reST- to numpy-style. This will make browsing docs.meerschaum.io easier.
v0.5.10¶
- Added the `clear pipes` command.
Users may now delete specific rows within a pipe using `pipe.clear()`. This new method includes support for the `--begin`, `--end`, and `--params` flags.
- Changed the default behavior of `--begin`.
The `--begin` flag is now only included when the user specifies it and no longer defaults to the sync time.
- Added `--timeout-seconds`.
The flags `--timeout-seconds` and `--timeout` make the syncing engine sync each pipe in a separate subprocess and kill the process if the sync exceeds the provided number of seconds.
- Fixed shell argparse bug.
When editing command line arguments within the shell, edge cases no longer cause the shell to exit.
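
As a rough illustration of the timeout behavior, the sketch below runs work in a separate process and kills it when a time limit passes. The helper `run_with_timeout()` is a hypothetical name, and the standard library's `subprocess.run()` stands in for whatever the syncing engine uses internally.

```python
import subprocess
import sys

def run_with_timeout(code: str, timeout_seconds: float) -> bool:
    """
    Run `code` in a separate Python subprocess.
    Return True if it exits cleanly in time, or False if it fails or is killed.
    """
    try:
        result = subprocess.run(
            [sys.executable, '-c', code],
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # `subprocess.run()` kills the child process before raising.
        return False
    return result.returncode == 0

# A quick "sync" finishes in time, but a slow one is killed after one second.
quick_sync = run_with_timeout("x = 1 + 1", timeout_seconds=10)
slow_sync = run_with_timeout("import time; time.sleep(30)", timeout_seconds=1)
```

Isolating each unit of work in its own process means a hung sync can be stopped without taking down the parent.
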
v0.5.6 – v0.5.9¶
- Added support for `gunicorn`.
Gunicorn may be used to manage API processes with the `--production` or `--gunicorn` flags. The `--production` flag is now default in the Docker image of the API server.
- Updated `bootstrap pipes` flow.
The interactive bootstrapping wizard now makes use of the new `register()` plugins API as well as asking for the `value` column.
- Fixed edge cases in `Pipe.filter_existing()`.
Better enforcement of `NaT` as well as `--begin` and `--end` now reduces edge-case bugs and unexpected behavior.
- Re-introduced the `full` Docker image.
Inclusion of the `start gui` command led to the full version of the Docker image requiring GTK and dependencies. Now you can forward the GUI with `docker run -e DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix bmeares/meerschaum:full start gui`.
- Added `ACKNOWLEDGEMENTS.md` to the root directory.
Only dynamic dependencies with BSD, LGPL, MIT, and Apache licenses remain.
- Fixed plugin installation bug (again).
- Allow for plugins with hyphens in the name.
- Lots of refactoring and tiny bugfixes.
v0.5.3 – v0.5.5¶
- Refactored the `start gui` and `start webterm` commands.
The `start gui` command opens a window which displays the webterm. This terminal will be integrated into the dashboard later.
- Began work on the desktop build.
Work on building with PyOxidizer began in these releases.
v0.5.1 – v0.5.2¶
- Added the experimental commands `start gui` and `start webterm`.
The desktop GUI will be rewritten in the future, but for now it starts a webview for the web console. The webterm is an instance of `xterm` (not `xterm.js`, having issues) and will eventually replace the current web console "terminal" output. The desktop GUI will also be replaced and will include the webterm once I can get it working on Windows.
- Isolated API processes.
Meerschaum API workers refer to the environment variable `MRSM_SERVER_ID` to determine the location of the configuration file. This results in the ability to run multiple instances of the API concurrently (at last)!
- Fixed plugin installation bug.
When installing a plugin, the installer now checks whether it is already installed, as expected.
- Replaced `semver.match()` with `semver.VersionInfo.match()`.
This change resolves deprecation warnings when building the package.
- Added the `/info` endpoint to the API.
This allows users to scrape tidbits of information about instances. The current dictionary returns the version and numbers of pipes, plugins, and users. More information will be added to this endpoint in future releases.
v0.5.0¶
- New syncing engine.
The `sync pipes` command reduces concurrency issues while nearly halving syncing times for large batches of pipes.
- Syncing progress bar.
The `sync pipes` command displays a progress bar (only in the shell) to track the number of completed pipes.
- Bumped default TimescaleDB image to PostgreSQL 14.
You can continue using PostgreSQL 13 if you already have an existing database.
- Changed API endpoints.
An endpoint for deleting pipes was added, and the editing and registration endpoints were changed to match the connector, metric, location path scheme.
- Redesigned test suite.
The `pytest` environment now checks syncing, registration, deletion, etc. for pipes and users with many database flavors.
- Cleanup and small bugfixes.
As a result of the updated testing suite, issues with several database flavors as well as the API have been resolved.
0.4.x Releases¶
The 0.4.x series brought dramatic updates to Meerschaum, such as ensuring compatibility with Python 3.10, migrating to Bootstrap 5, and implementing useful features like the redesigned web console and the shell toolbar.
v0.4.16 – v0.4.18¶
- Rewritten API `register()` methods.
- MySQL / MariaDB and CockroachDB fixes.
- Additional tests.
v0.4.11 – v0.4.15¶
- Change the number of columns when printing items.
Depending on the lengths of items and size of the terminal, the number of columns is reduced until most items are not truncated.
- Allow shell jobs with the `-f` flag.
In addition to `--allow-shell-job`, the `--force` flag permits non-Meerschaum commands to be run. If these flags are absent, a more informative error message is printed.
- Redesigned the bottom toolbar.
The bottom toolbar now uses a black background with white text. Although this technically still prints ANSI when the global ANSI configuration is false, it still does toggle color.
- More bugfixes.
A warning when installing plugins has been addressed, and other virtual environment and portable bugs have been fixed.
v0.4.8 – v0.4.10¶
- Added the bottom toolbar to the interactive shell.
The toolbar includes the current instance, repo, and connection status.
- Fixed parsing issue with the Docker build.
There is a strange edge case where multiple levels of JSON-encoding needed to be escaped, and this scenario has been accounted for.
- Enforce `MRSM_CONFIG` and `MRSM_PATCH` in the Web Console actions.
The Docker version of the API uses environment variables to manage instances, so this information is passed along to child threads.
- Delayed imports when changing instances.
This postpones trying to connect to an instance until as late as possible.
v0.4.1 – v0.4.7¶
- Added features to the Web Console.
Features such as the `Show Pipes` button and others were added to give the Web Console better functionality.
- Migrated the Web Console to Bootstrap 5.
Many components needed to be modified or rewritten, but ultimately the move to Bootstrap 5 is worth it in the long run.
- Updated to work on Python 3.10.
This included creating a standalone internal module for `cascadict` since the original project is no longer maintained.
- Tighter security.
Better enforcement of datetimes in `dateadd_str()` and denying users access to actions if the permissions setting does not allow non-admins to perform actions.
- Bugfixes for broken dependencies.
In addition to migrating to Bootstrap 5, components like `PyYAML` and `fastapi-login` changed their function signatures, which broke things.
v0.4.0¶
- Allow for other plugins to be specified as dependencies.
Other plugins from the same repository may be specified in the `required` list.
- Added warnings for broken plugins.
When plugins fail to be imported, warnings are thrown to help authors identify the problem.
- Added registration to the Web Console.
New users may create accounts by clicking the "No account?" link on the login page.
- Added the `verify` action.
For now, `verify packages` ensures that the installed dependencies meet the stated requirements for the installed version of Meerschaum.
- Fixed "ghost" background jobs.
Ensure that jobs are actually running before marking them as running.
0.3.x Releases¶
Version 0.3.0 introduced the web interface and added more robust SQL support for various flavors, including MSSQL and DuckDB.
v0.3.12 – v0.3.19¶
- Mostly small bugfixes.
Docker-compose fixes, `params` in `get_pipe_rowcount()`, unique index names for pipes.
- Added `newest` flag to `pipe.get_sync_time()`.
Setting `newest=False` will return the oldest time instead of the newest.
- Migrated `filter_existing` to a member of `Pipe`.
Although the current implementation for APIConnectors offloads filtering to the SQLConnector, soon filtering will take place locally to save bandwidth.
- Updated Docker base image.
Bumped base image from Python 3.7 on Debian Buster Slim to Python 3.9 on Debian Bullseye Slim. Also removed ARM images for the sake of passing builds and reducing build times (e.g. DuckDB fails to compile with QEMU).
- Improved DuckDB support.
`sql:memory` is now the default in-memory DuckDB instance.
v0.3.1 – v0.3.11¶
- Improved Microsoft SQL Server support.
- Added plugins page to the dashboard.
Although somewhat hidden away, the path `/dash/plugins` will show the plugins hosted on the API repository. If the user is logged in, the descriptions of plugins belonging to that user become editable.
- Added locks to resolve race conditions with threading.
- Added `--params` when searching for data and backtracked data.
- Fixed the `--params` flag for API pipes.
- Added experimental multiplexed fetching feature
To enable this feature, run `mrsm edit config system` and under the `experimental` section, set `fetch` to `true`.
- Bugfixes and stability improvements
v0.3.0¶
- Introduced the Web Interface.
Added the Meerschaum Web Interface, an interactive dashboard for managing Meerschaum instances. Although not a total replacement for the Meerschaum Shell, the Web Interface allows multiple users to share connectors without needing to remote into the same machine.
- Background jobs
Actions may be run in the background with the `-d` or `--daemon` flags or with the action `start job`. To assign a name to a job, pass the flag `--name`.
- Added `duckdb` as a database flavor
The `duckdb` database flavor is a single file, similar to `sqlite`. Future releases may use `duckdb` as the cache store for local pipes' data.
- Added `uninstall plugins` and `uninstall packages`.
Plugins and virtual environment `pip` packages may now be removed via the `uninstall` command.
- Delete plugin from repository
The command `delete plugins` now deletes the archive file and database registration of the plugin on the remote repository. This does not uninstall plugins, so deleted plugins may be re-registered if they are still installed on the client.
- Bound syncing with `--begin` and `--end`
When performing a sync, you can specify `--begin` and `--end` to bound the search for retrieving data.
- Bugfixes and improvements
Small bugfixes like including the location `None` with other locations and improvements like only searching for plugin auto-complete suggestions when the search term is at least 1 character long.
0.2.x Releases¶
Version 0.2 improved greatly on 0.1, with a greater focus on the user experience, plugins, local performance, and a whole lot more. Read the release notes below for some of the highlights.
v0.2.22¶
- Critical bugfixes. Version 0.2.22 fixes some critical bugs that went unnoticed in v0.2.21 and is another backport from the 0.3.x branch.
v0.2.21¶
- Bugfixes and performance improvements. Improvements that were added to v0.3.0 release candidates were backported to the 0.2.x series prior to the release of v0.3.0. This release is essentially v0.3.0 with the Web Interface disabled.
v0.2.20¶
- Reformatted `show columns` to tables.
The action `show columns` now displays tables rather than dictionaries.
- SQLConnector bugfixes.
The `debug` flag was breaking functionality of `SQLConnector` objects, but now connectors are more robust and thread safe.
- Added `instance` as an alias to `mrsm_instance` when creating `Pipe` objects.
For convenience, when building `Pipe` objects, `instance` may be used in place of `mrsm_instance`.
v0.2.19¶
- Added `show columns` action.
The action `show columns` will now display a pipe's columns and data types.
- `docker-compose` bugfix.
When `docker-compose` is installed globally, skip using the virtual environment version.
- Refactoring / linting
A lot of code was cleaned up to conform with cleaner programming practices.
v0.2.18¶
- Added `login` action.
To verify or correct login credentials for an API instance, run the `login` action. The action will try to log in with your defined usernames and passwords. If a connector is missing a username or a password is incorrect, it will ask if you would like to try different login credentials. Upon success, it will ask if you would like to save the new credentials to the primary configuration file.
- Critical bugfix.
Fixed bug where `default` values were being copied over from the active shell `instance`. I finally found, deep in the code, the missing `.copy()`.
- Reset `api:mrsm` to default repository.
In my task to move everything to the preconfigured instance, I overstepped and made the default repository into the configured `instance`, which by default is a SQLConnector, so that broke things! In case you were affected by this change, you can simply reset the value of `default_repository` to `api:mrsm` (or your `api` server) to return to the desired behavior.
- 🧹 Housekeeping (refactoring).
I removed nearly all instances of declaring mutable types as optional values, as well as additional `typing` hints. There may still be some additional cleaning to do, but now the functions are neat and tidy!
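
Both the missing `.copy()` and the mutable optional values come down to shared mutable state in Python. The sketch below shows the general pitfall and the usual remedy; the functions `add_flag_shared()` and `add_flag_safe()` are hypothetical and are not taken from the Meerschaum codebase.

```python
from typing import Any, Dict, Optional

def add_flag_shared(flag: str, config: Dict[str, Any] = {}) -> Dict[str, Any]:
    """The default dictionary is created once and shared between calls."""
    config[flag] = True
    return config

def add_flag_safe(flag: str, config: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a fresh dictionary per call and copy any dictionary passed in."""
    config = {} if config is None else config.copy()
    config[flag] = True
    return config

# The shared default remembers the first call's flag.
add_flag_shared('debug')
leaked = add_flag_shared('force')

# The safe version starts clean on every call.
add_flag_safe('debug')
clean = add_flag_safe('force')
```

Here `leaked` still carries the `debug` flag from the earlier call, while `clean` only contains `force`.
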
v0.2.17¶
- Added CockroachDB as a supported database flavor.
CockroachDB may be a data source or a Meerschaum backend. There may be some performance tuning to do, but for now, it is functional. For example, I may implement bulk insert for CockroachDB like what is done for PostgreSQL and TimescaleDB.
- Only attempt to install readline once in Meerschaum portable.
The first Meerschaum portable launch will attempt to install readline, but even in case of failure, it won't try to reinstall during subsequent launches or reloads.
- Refactored SQLAlchemy configuration.
Under `system:connectors:sql`, the key `create_engine` has been added to house all the `sqlalchemy` configuration settings. WARNING: You might need to run `delete config system` to refresh this portion of the config file in case any old settings break things.
- Dependency conflict resolution.
- As always, more bugfixes :)
v0.2.16¶
- Hypertable improvements and bugfixes.
When syncing a new pipe, if an `id` column is specified, create partitions for the number of unique `id` values.
- Only use `api:mrsm` for plugins, resort to default `instance` for everything else.
- Fix bug that mirrored changes to `main` under `default`.
v0.2.15¶
- MySQL/MariaDB bugfixes.
- Added `aiomysql` as a driver dependency.
v0.2.14¶
- Implemented `bootstrap pipes` action.
The `bootstrap pipes` wizard helps guide new users through creating connectors and pipes.
- Added `edit pipes definition` action.
Adding the word `definition` to the `edit pipes` command will now open a `.sql` file for pipes with `sql` connectors.
- Changed `api_instance` to symlink to `instance` by default.
- Registering users applies to instances, not repositories.
The action `register users` now uses the value of `instance` instead of `default_repository`. For users to make accounts with `api.mrsm.io`, they will have to specify `-i api:mrsm`.
v0.2.13¶
- Fixed symlink handling for nesting dictionaries.
For example, the environment variables for the API service now contain clean references to the `meerschaum` and `system` keys.
- Added `MRSM_PATCH` environment variable.
The `MRSM_PATCH` environment variable is treated the same as `MRSM_CONFIG` and is loaded after `MRSM_CONFIG` but before patch or permanent patch files. This allows the user to apply a patch on top of a symlinked reference. In the docker-compose configuration, `MRSM_PATCH` is used to change the `sql:main` hostname to `db`, and the entire `meerschaum` config file is loaded from `MRSM_CONFIG`.
- Bugfixes, improved robustness.
Per usual, many teeny bugs were squashed.
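
The layering of `MRSM_PATCH` on top of `MRSM_CONFIG` can be pictured as a recursive dictionary merge. The sketch below is a simplified illustration: the function `apply_patch()`, the nested key layout, and the example values are assumptions for demonstration and not Meerschaum's actual loader.

```python
import copy
from typing import Any, Dict

def apply_patch(config: Dict[str, Any], patch: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively overlay `patch` onto a copy of `config`."""
    patched = copy.deepcopy(config)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(patched.get(key), dict):
            # Merge nested dictionaries instead of replacing them wholesale.
            patched[key] = apply_patch(patched[key], value)
        else:
            patched[key] = copy.deepcopy(value)
    return patched

# Hypothetical stand-ins for the parsed contents of `MRSM_CONFIG` and `MRSM_PATCH`.
mrsm_config = {
    'meerschaum': {
        'connectors': {'sql': {'main': {'host': 'localhost', 'port': 5432}}},
    },
}
mrsm_patch = {
    'meerschaum': {
        'connectors': {'sql': {'main': {'host': 'db'}}},
    },
}

config = apply_patch(mrsm_config, mrsm_patch)
main_connector = config['meerschaum']['connectors']['sql']['main']
```

Only the `host` value changes in this example; sibling keys such as `port` survive because nested dictionaries are merged rather than overwritten.
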
v0.2.12¶
- Improved symlink handling in the configuration dictionary.
Symlinks are now stable and persistent but at this time cannot be chained together.
- Improved config file syncing.
Generated config files (e.g. Grafana data sources) may only be edited from the main `edit config` process.
- Upgraded to PostgreSQL 13 TimescaleDB by default.
This may break existing installs, but you can revert back to 12 with `edit config stack` and changing the string `latest-pg13-oss` under the `db` image to `latest-pg12-oss`.
- Bugfixes.
Like always, this release includes miscellaneous bugfixes.
v0.2.11 (release notes before this point are back-logged)¶
- API Chaining
Set a Meerschaum API as the parent source connector for a child Meerschaum API, as if it were a SQLConnector.
v0.2.10¶
- MRSM_CONFIG critical bugfix
The environment variable `MRSM_CONFIG` is patched on top of your existing configuration. `MRSM_PATCH` is also a patch that is added after `MRSM_CONFIG`.
v0.2.9¶
- API and SQL Chunking
Syncing data via an APIConnector or SQLConnector uploads the dictionary or DataFrame in chunks (defaults to a chunksize of 900). When calling `read()` with a SQLConnector, a `chunk_hook` callable may be passed, and if `as_chunks` is `True`, a list of DataFrames will be returned. If `as_iterator` is `True`, a dataframe iterator will be returned.
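
The idea behind chunked uploads can be sketched with plain Python lists. The generator `iter_chunks()` below is a hypothetical helper which borrows the default chunksize of 900 mentioned above; it does not call any Meerschaum connector.

```python
from typing import Any, Dict, Iterator, List

def iter_chunks(
    rows: List[Dict[str, Any]],
    chunksize: int = 900,
) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of `rows` with at most `chunksize` rows each."""
    for start in range(0, len(rows), chunksize):
        yield rows[start:start + chunksize]

# 2,000 rows are split into two full chunks and one partial chunk.
rows = [{'id': i} for i in range(2000)]
chunk_sizes = [len(chunk) for chunk in iter_chunks(rows)]
```

Yielding chunks lazily keeps only one slice in flight at a time, which is the same motivation behind returning an iterator instead of a full list.
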
v0.2.8¶
- API Chaining introduction
Chaining was first released in v0.2.8, though it was finalized in v0.2.11.
v0.2.7¶
- Shell autocomplete bugfixes
v0.2.6¶
- Miscellaneous bugfixes and dependency updates
v0.2.1 – v0.2.5¶
- Shell improvements
Stability, autosuggest, and more.
- Virtual environments
Isolate dependencies via virtual environments. The primary entrypoint for virtual environments is `meerschaum.utils.packages.attempt_import()`.
v0.2.0¶
- Plugins
Introduced the plugin system, which allows users and developers to easily integrate any data source into Meerschaum. You can read more about plugins here.
- Repositories
Repositories are Meerschaum APIs that register and serve plugins. To register a plugin, you need a user login for that API instance.
- Users
A user account is required for most functions of the Meerschaum API (for security reasons). By default, user registration is disabled from the API side (but can be enabled with `edit config system` under `permissions`). You can register users on a direct SQL connection to a Meerschaum instance.
- Updated shell design
Added a new prompt, intro, and more shell design improvements.
- SQLite improvements
The connector `sql:local` may be used as a backend for cases such as when running on a low-powered device like a Raspberry Pi.
0.1.x Releases¶
Meerschaum's first point release focused on a lot, but mainly stability and improving important functionality, such as syncing.
0.0.x Releases¶
A lot was accomplished in the first 60 releases of Meerschaum. For the most part, the groundwork for core concepts like pipes, syncing, the config system, SQL and API connectors, bulk inserts, and more was laid.