I fixed shell pipes
In a previous post I made pipes in unix shells more reliable. Well, that solution had some drawbacks. I’ll summarize the problem and the failed previous version, and then show the new and improved one.
Problem summary
A downstream process in a unix shell pipe cannot know whether the upstream finished successfully or exited with an error. Either way, it just sees EOF on its stdin. This means that it can’t know if it should “commit” the data it received.
Example uses:
$ pg_dumpall | xz -9 | google_cloud_storage_upload gs://bucket/path/postgres.dump
$ generate_data | psql --single-transaction
In both of these cases, if the left hand side fails, you want the right hand side to STOP, and not finalize the upload or commit the transaction.
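The problem is easy to reproduce with a stand-in upstream that writes some output and then fails. This is only a small sketch: flaky_upstream and naive_downstream are made-up names for illustration, not real tools.

```sh
# Hypothetical upstream: writes partial output, then exits with an error.
flaky_upstream() {
  printf 'row 1\nrow 2\n'
  return 1
}

# Hypothetical downstream: reads until EOF, then "commits".
# It has no way to see that the upstream failed.
naive_downstream() {
  count=$(wc -l | tr -d ' ')
  echo "committed $count rows"
}

flaky_upstream | naive_downstream
```

This prints "committed 2 rows" even though the upstream returned an error, because all the downstream ever sees is an ordinary EOF.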
The previous version
$ goodpipe <<EOF
[
["gsutil", "cat", "gs://example/input-unsorted.txt"],
["sort", "-S300M", "-n"],
["gzip", "-9"],
["gsutil", "cp", "-", "gs://example/input-sorted-numerically.txt.gz"]
]
EOF
This works fine for simple cases, but doesn’t support tee or per-command
environment variables very well.
And I didn’t want to invent a complex language, so my replacement took a different path.
wp — Wrap Pipe
wp instead wraps the input and/or output with a very minimal encapsulating
protocol. This allows normal data to pass through, but still allows the
downstream to get EOF as metadata.
If the data stream ends before the EOF marker has been received, then the
downstream must not commit. In that case the wrapped downstream child
process never sees its stdin close; instead, it gets terminated with a
signal.
wp can either encapsulate when it wraps something that outputs data, with
wp -o, or decapsulate and receive the EOF marker when it’s handling input
data, with wp -i, or do both, with wp -io.
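To make the idea concrete, here is a toy version of such an encapsulation in plain shell. It is only a sketch of the principle and not wp's actual wire format: the "D " and "E" line tags and the function names toy_encap and toy_decap are invented for this example, and it assumes newline-terminated text lines rather than arbitrary binary data.

```sh
# Toy encapsulator (hypothetical, not wp's real protocol):
# prefix every output line of the command with "D ", then append
# a lone "E" trailer line only if the command exited successfully.
toy_encap() {
  status_file=$(mktemp) || return 1
  { "$@"; echo "$?" >"$status_file"; } | sed 's/^/D /'
  status=$(cat "$status_file")
  rm -f "$status_file"
  if [ "$status" -eq 0 ]; then
    printf 'E\n'
  fi
  return "$status"
}

# Toy decapsulator: strip the "D " tag from data lines and exit 0
# only if the "E" trailer was seen before the stream ended.
toy_decap() {
  awk '
    /^D / { print substr($0, 3); next }
    $0 == "E" { ok = 1; exit }
    END { exit (ok ? 0 : 1) }
  '
}

# Upstream succeeds: the trailer arrives, so the stream is complete.
toy_encap printf 'a\nb\n' | toy_decap && echo "complete"

# Upstream fails: no trailer, so the receiver knows not to commit.
toy_encap sh -c 'echo a; exit 1' | toy_decap || echo "incomplete, do not commit"
```

The toy only reports the result through its exit status. As described above, wp goes further and has the wrapped child terminated with a signal, so the child never gets the chance to treat a truncated stream as a clean EOF.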
Examples
$ wp -o pg_dumpall | wp -io xz -9 | wp -i google_cloud_storage_upload gs://bucket/path-postgres.dump
$ wp -o generate_data | wp -i psql --single-transaction
Quick install, if you have cargo
cargo install --locked wp-cli