I wouldn't make a post on my blog just so I don't have to keep googling something, would I? Of course I would. It's like… 95% of the reason I keep this.
Totally static Go builds - these are great for running in Docker containers. The important part is the command line to create them - it's varied a bit, but the most thorough I've found is this (see this GitHub issue):
CGO_ENABLED=0 GOOS=linux go build -a -ldflags '-extldflags "-static"' .
This will create an “as static as possible” binary - beware linking in things which want glibc, since pluggable name resolvers will be a problem (which you can work around in Docker quite well, but that's another question).
Quickly configuring modelines?
Something hopefully no one should ever have to do in the far distant future, but since I insist on using old hardware till it drops, it still comes up.
Working from an SSH console on an XBMC box, I was trying to tune in an elusive 1366x768 modeline for an old plasma TV.
The best way to do it these days is with xrandr in a ~/.xprofile script, which is loaded on boot up.
To quickly go through modelines I used the following shell script:
#!/bin/bash
xrandr -d :0 --output VGA-0 --mode "1024x768"
xrandr -d :0 --delmode VGA-0 "1360x768"
xrandr -d :0 --rmmode "1360x768"
xrandr -d :0 --newmode "1360x768" "$@"
xrandr -d :0 --addmode VGA-0 "1360x768"
xrandr -d :0 --output VGA-0 --mode "1360x768"
Simply passing in a modeline when running it causes that modeline to be set and applied to the relevant output (VGA-0 in my case).
./tryout 84.750 1366 1480 1568 1800 768 769 776 800 -hsync +vsync
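Before trying a modeline it can help to check what refresh rate it implies, which is just the pixel clock divided by the horizontal and vertical totals. A minimal sketch, assuming the usual modeline field order (clock in MHz, then four horizontal and four vertical timings); the function name modeline_refresh is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: print the refresh rate (Hz) implied by a modeline.
# Assumed field order: clock(MHz) hdisp hsyncstart hsyncend htotal
#                      vdisp vsyncstart vsyncend vtotal [flags...]
modeline_refresh() {
    if [ "$#" -lt 9 ]; then
        echo "usage: modeline_refresh clock h1 h2 h3 htotal v1 v2 v3 vtotal [flags]" >&2
        return 1
    fi
    # refresh = pixel clock / (htotal * vtotal)
    awk -v clock="$1" -v htotal="$5" -v vtotal="$9" \
        'BEGIN { printf "%.2f\n", clock * 1000000 / (htotal * vtotal) }'
}
```

For the modeline above, modeline_refresh 84.750 1366 1480 1568 1800 768 769 776 800 -hsync +vsync prints 58.85, since 84750000 / (1800 * 800) is roughly 58.85 Hz.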
Somehow the installation instructions for Docker never work for me and the website is surprisingly cagey about the manual process.
It works perfectly well if you just grab the relevant bits of that script and run them manually, but usually fails if you let it be a bit too magical.
To be fair, I probably have issues due to the mismatch of LSB release since I run Mint. Still though.
So here are the commands for Ubuntu:
$ apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
$ echo deb https://apt.dockerproject.org/repo ubuntu-vivid main > /etc/apt/sources.list.d/docker.list
$ apt-get update && apt-get install -y docker-engine
I modified Rocky Bernstein's go-play to compile with go-assetfs and run from a single executable. Get it here!
Why and How
IPython is one of the things I love best about Python. In a dynamically typed language it's a huge benefit to be able to quickly and easily paste in chunks of code and investigate what the actual output would be or what an error situation would look like.
Go is not dynamically typed, but many of the same issues tend to apply - when errors arise they can be tricky to introspect without diving through the code, and sometimes the syntax or results of a function call aren't obvious.
As a learning tool, Go provides the Go Playground - a web service which compiles and runs snippets of Go code within a sandbox, which has proven a huge boon to the community for sharing and testing solutions (it's very popular on Stack Overflow).
The public Go Playground is necessarily limited - and it would be nice to be able to use Go in the same way client-side, or just without internet access.
Fortunately Rocky Bernstein pulled together an unrestricted copy of the Go Playground which runs as a client-side HTML5 app. Unlike the web playground, this allows unrestricted Go execution on your PC and full testing of things as they would work locally. The GitHub export is found here.
The one problem I had with this was that this version still exposed dependencies on the location of source files outside the executable - which for a tiny tool was kind of annoying. Fortunately this has been solved in Go for a long time - and a little fun with go-bindata-assetfs yielded my own version which once built runs completely locally.
Get it here. It's fully go-gettable too: go get github.com/wrouesnel/go-play will work.
You have a server you can SSH to. For whatever reason AllowTcpForwarding is disabled. You need to forward a port from it to your local machine.
If it's any sort of standard machine, then it probably has netcat. It's less likely to have the far more powerful socat - which we'll only need locally.
This tiny tip serves two lessons: (1) disabling SSH port forwarding is not a serious security measure, and far more of an annoyance. And (2) since it's pretty likely you still need to do whatever job you need to do, it would be nice to have a 1-liner which will just forward the port for you:
socat TCP-LISTEN:<local port>,reuseaddr,fork "EXEC:ssh <server> nc localhost <remote port>"
It's kind of obvious if you know socat well, but half the battle is simply knowing it's possible.
Obviously you can change localhost to also be a remote server. And this is really handy if you want to do debugging since socat can echo all data to the console for you if you want.
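For the debugging case, socat's -v option writes the transferred data to stderr as well as forwarding it. A small sketch wrapping the one-liner above; the function name forward_port and the DRY_RUN switch are made up for illustration:

```shell
#!/bin/bash
# Hypothetical wrapper around the one-liner above.
# Usage: forward_port <local port> <server> <remote port> [remote host]
# Set DRY_RUN=1 to print the socat command instead of running it.
forward_port() {
    local lport="$1" server="$2" rport="$3" rhost="${4:-localhost}"
    # -v makes socat also write the forwarded data to stderr.
    local cmd=(socat -v "TCP-LISTEN:${lport},reuseaddr,fork"
               "EXEC:ssh ${server} nc ${rhost} ${rport}")
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}
```

Running DRY_RUN=1 forward_port 8080 myserver 80 only prints the command, which is handy for checking the quoting before actually opening the listener.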
As I said at the start: if you have standard tools installed, or if your users can upload new tools (which, with shell access they can), and if you don't have firewall rules or cgroups limitations on those accounts, then stuff like disabled port forwards is not a security measure.
When I was travelling Europe I found some surprisingly restricted wi-fi hotspots in hotels. This was annoying because I use SSH to upload photos back home from my phone, but not having set up any tunneling helpers I just had to wait till I found a better one.
There are a number of solutions to SSH tunneling, but the main thing I wanted to do was implement something which would let me run several fallbacks at once. Enter sshttp.
sshttp is related to sslh, in the sense that they are both SSH connection multiplexers. The idea is that you point a web browser at port 80, and you get a web page. You point your SSH client at the same port, and you get an SSH connection. Naive firewalls let the SSH traffic through without complaint.
The benefit of sshttp over sslh is that it uses Linux's IP_TRANSPARENT flag, which means that your SSH and HTTP logs all show proper source IPs, which is great for auditing and security.
This is a post about how I set it up for my specific server case; the instructions I used as a guide were adapted from here.
Since I discovered it, I've been in love with the concept behind bup.
bup appeals to my sense of efficiency in taking backups: backups should back up the absolute minimum amount of data so I can have the most versions, and then that frees me to use whatever level of redundancy I deem appropriate for my backup media.
But more than that, the underlying technology of bup is ripe with possibility: the basic premise of a backup tool gives rise to the possibility of a sync tool, a deduplicated home directory tool, distributed repositories, archives and more.
how it works
A more complete explanation can be found on the main GitHub repository, but essentially bup applies rsync's rolling-checksum (literally, the same algorithm) to determine file-differences, and then only backs up the differences - somewhat like rsnapshot.
Unlike rsnapshot however, bup then applies deduplication of the chunks produced this way using SHA1 hashes, and stores the results in the git-packfile format.
This is both very fast (rsnapshot, conversely, is quite slow) and very redundant - the Git tooling is able to read and understand a bup repository as just a Git repository with a specific commit structure (you can run gitk -a in a .bup directory to inspect it).
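The deduplication idea can be illustrated with a deliberately simplified sketch. It is not how bup does it: it assumes fixed-size chunks rather than rolling-checksum boundaries, and a plain directory of files named by SHA1 rather than a Git packfile. The function name store_chunks is made up for illustration.

```shell
#!/bin/bash
# Simplified illustration only: fixed-size chunks instead of bup's rolling
# checksum, and a directory of files instead of a Git packfile.
# Usage: store_chunks <file> <chunk size in bytes> <store dir>
# Prints the number of chunks newly written to the store.
store_chunks() {
    local file="$1" size="$2" store="$3" tmp new=0 chunk sum
    tmp="$(mktemp -d)" || return 1
    mkdir -p "$store"
    split -b "$size" "$file" "$tmp/chunk."
    for chunk in "$tmp"/chunk.*; do
        [ -e "$chunk" ] || continue   # empty input produces no chunks
        sum="$(sha1sum "$chunk" | cut -d' ' -f1)"
        # A chunk whose SHA1 is already in the store is not stored again.
        if [ ! -e "$store/$sum" ]; then
            mv "$chunk" "$store/$sum"
            new=$((new + 1))
        fi
    done
    rm -rf "$tmp"
    echo "$new"
}
```

Storing the same file twice writes nothing the second time. Fixed-size chunks stop matching as soon as a byte is inserted near the start of a file, which is the case content-defined boundaries from a rolling checksum are meant to handle.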
why it's space efficient
bup's archive and rolling-checksum format mean it is very space efficient. bup can correctly deduplicate data that undergoes insertions, deletions, shifts and copies. bup deduplicates across your entire backup set, meaning the same file uploaded 50 times is only stored once - in fact it will only be transferred across the network once.
For comparison I recently moved 180 GB of ZFS snapshots of the same dataset undergoing various daily changes into a bup archive, and successfully compacted it down to 50 GB. I suspect I could have gotten it smaller if I'd unpacked some of the archive files that have been created in that backup set.
That is a dataset which is already deduplicated via copy-on-write semantics (it was not using ZFS deduplication because you should basically never use ZFS deduplication).
why it's fast
Git is well known as being bad at handling large binary files - it was designed to handle patches of source code, and makes assumptions to that effect. bup steps around this problem because it only uses the Git packfile and index format to store data: where Git is slow, bup implements its own packfile writers and index readers to make looking up data in Git structures fast.
bup also uses some other tricks to do this: it will combine indexes into files to speed up lookups, and builds bloom filters to speed up adding data (a bloom filter is a fast data structure based on hashes which tells you something is either ‘probably in the data set’ or definitely not).
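To make the "probably in the set or definitely not" behaviour concrete, here is a toy bloom filter. It is an illustration only and not bup's implementation: the sizes, the salted cksum hashing and all of the names are assumptions chosen to keep it short.

```shell
#!/bin/bash
# Toy bloom filter, for illustration only (not bup's implementation).
# BLOOM_BITS slots, BLOOM_HASHES salted cksum-based hash functions.
BLOOM_BITS=1024
BLOOM_HASHES=3
BLOOM=()

# Print the slot indexes for an item, one per line.
bloom_slots() {
    local item="$1" i crc
    for ((i = 0; i < BLOOM_HASHES; i++)); do
        crc="$(printf '%s:%s' "$i" "$item" | cksum | cut -d' ' -f1)"
        echo $((crc % BLOOM_BITS))
    done
}

# Mark every slot for the item as set.
bloom_add() {
    local slot
    for slot in $(bloom_slots "$1"); do
        BLOOM[$slot]=1
    done
}

# Succeeds if the item is "probably in the set", fails if definitely not.
bloom_check() {
    local slot
    for slot in $(bloom_slots "$1"); do
        [ -n "${BLOOM[$slot]:-}" ] || return 1
    done
    return 0
}
```

After bloom_add "some-chunk-hash", bloom_check "some-chunk-hash" always succeeds. A check on something never added usually fails, but can occasionally succeed when its slots happen to collide with ones already set, so a positive answer still has to be confirmed against the real index.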
using bup for Windows backups
bup is a Unix/Linux oriented tool, but in practice I've applied it most usefully at the moment to some Windows servers.
bup runs under Cygwin on Windows, and is far superior to the built-in Windows backup system for file-based backups. It's best to combine it with the vscsc tool, which allows using 1-time snapshots to save the backup and avoid inconsistent state.
If you want to use this script on Cygwin then you need to install the
This script is reasonably complicated but it is designed to be robust against failures in a sensible way - and if we somehow fail running bup, to fall back to making tar archives - giving us an opportunity to fix a broken backup set.
This script will work for backing up to your own remote server today. But, it was developed to work around limitations which can be fixed - and which I have fixed - and so the bup of tomorrow will not have them.
towards the perfect backup
The script above was developed for a client, and the rsync-first stage was designed to ensure that the most recent backup would always be directly readable from a Windows Samba share and not require using the command line.
It was also designed to work around a flaw with bup's indexing step which makes it difficult to use with variable paths as produced by the vscsc tool in Cygwin.
Although bup will work just fine, it will insist on trying to hash the entire backup set every time - which is slow. This can be worked around by symlinking the backup path in Cygwin beforehand, but since we needed a readable backup set it was as quick to use rsync in this instance.
But it doesn't have to be this way. I've submitted several patches against bup which are also available in my personal development repository of bup on GitHub.
The indexing problem is fixed via index-grafts: modifying the bup-index to support representing the logical structure as it is intended to be in the bup repository, rather than the literal disk path structure. This allows the index to work as intended without any games on the filesystem, hashing only modified or updated files.
The need for a directly accessible version of the backup is solved via a few other patches. We can modify the bup virtual-filesystems layer to support a dynamic view of the bup repository fairly easily, and add WebDAV support to the bup-web command (the
With these changes, a bup repository can now be directly mounted as a Windows mapped network drive via Explorer's web client, and files opened and copied directly from the share. Any version of a backup set is then trivially accessible and, importantly, we can simply start bup-web as a Cygwin service and leave it running.
Hopefully these patches will be incorporated into mainline bup soon (they are awaiting review).
so should I use it?
Even with the things I've had to fix, the answer is absolutely. bup is by far the best backup tool I've encountered lately. For a basic Linux system it will work great, for manual backups it will work great, and with a little scripting it will work great for automatic backups under Windows and Linux.
The brave can try out the cutting-edge branch on my GitHub account to test out the fixes in this blog post, and if you do then posting about them to the bup-list mailing list (https://groups.google.com/forum/#!forum/bup-list) with any problems or successes or code reviews would help a lot.
This is a quick note on something I encountered while trying to work out why my Realtek NICs are so finicky about connecting and staying connected at gigabit speeds when running Linux.
The current hypothesis is that the r8168 driver isn't helping very much. So I uninstalled it - and ran into two problems.
In a passing comment it was suggested to me that it would be really great if the home fileserver offered some type of web interface to find things. We've been aggregating downloaded files there for a while, and there have been attempts made at categorization but this all really falls apart when you wonder “what does ‘productivity’ mean? And does this go under ‘Linux’ or some other thing?”
Since lately I've been wanting to get desktop search working on my actual desktops, via Gnome's Tracker project and its tie-in to Nautilus and Nemo (possibly the subject of a future blog), it seemed logical to run it on the fileserver as an indexer for our shared directories - and then to tie some kind of web UI to that.
Unfortunately, Tracker is very desktop orientated - there's no easy daemon mode for running it on a headless system out-of-the-box, but with a little tweaking you can make it work for you quite easily.
On my system I keep Tracker running as its own user under a system account. On Ubuntu you need to create this like so (using a root shell -
$ adduser --system --shell=/bin/false --disabled-login --home=/var/lib/tracker tracker
$ adduser tracker root
Since tracker uses GSettings for its configuration these days, you need to su into the user you just created to actually configure the directories which should be indexed. Since this is a server, you probably just have a list of them so set it somewhat like the example below. Note: you must run the dbus-launch commands in order to have a viable session bus for dconf to work with. This will also be a requirement of Tracker later on.
$ su --shell /bin/bash tracker
$ eval `dbus-launch --sh-syntax`
$ dconf write /org/freedesktop/tracker/miner/files/index-recursive-directories "['/path/to/my/dir/1', '/path/to/my/dir/2', '/etc/etc']"
$ kill $DBUS_SESSION_BUS_PID
$ exit
Your Tracker user is now ready. To start and stop the service, we use an Upstart script like the one below:
description "gnome tracker system startup script"
author "wrouesnel"

start on (local-filesystems and net-device-up)
stop on shutdown

respawn
respawn limit 5 60

setuid tracker

script
    chdir /var/lib/tracker
    eval `dbus-launch --sh-syntax`
    echo $DBUS_SESSION_BUS_PID > .tracker-sessionbus.pid
    echo $DBUS_SESSION_BUS_ADDRESS > .tracker-sessionbus
    /usr/lib/tracker/tracker-store
end script

post-start script
    chdir /var/lib/tracker
    while [ ! -e .tracker-sessionbus ]; do sleep 1; done
    DBUS_SESSION_BUS_ADDRESS=$(cat .tracker-sessionbus) /usr/lib/tracker/tracker-miner-fs &
end script

post-stop script
    # We need to kill off the DBUS session here
    chdir /var/lib/tracker
    kill $(cat .tracker-sessionbus.pid)
    rm .tracker-sessionbus.pid
    rm .tracker-sessionbus
end script
Some things to focus on about the script: we launch and save the DBus session parameters. We'll need these to reconnect to the session to run tracker related commands. The post-stop stanza is to kill off the DBus session.
You do need to explicitly launch tracker-miner-fs in order for file indexing to work, but you don't need to kill it explicitly - it will be automatically shut down when Upstart kills
Also note that since tracker runs as the user tracker it can only index files and directories which it is allowed to traverse, so check your permissions.
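Reconnecting to the saved session later comes down to exporting the stored bus address before running a command as the tracker user. A minimal sketch, assuming the .tracker-sessionbus file written by the Upstart job above; the function name with_tracker_bus and the TRACKER_HOME override are made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: run a command against the session bus saved by the
# Upstart job above. TRACKER_HOME defaults to the job's working directory.
with_tracker_bus() {
    local home="${TRACKER_HOME:-/var/lib/tracker}"
    local addr_file="$home/.tracker-sessionbus"
    if [ ! -r "$addr_file" ]; then
        echo "no saved session bus at $addr_file (is the job running?)" >&2
        return 1
    fi
    # Only the wrapped command sees the saved bus address.
    DBUS_SESSION_BUS_ADDRESS="$(cat "$addr_file")" "$@"
}
```

Any of the Tracker command line tools can then be passed as the arguments, as long as the wrapper itself runs as the tracker user so that it can read the file and talk to the bus.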
You can now start Tracker as your user with start tracker. And stop it with stop tracker. Simple and clean.
My plan for this setup is to throw together a Node.js app on my server that will forward queries to the tracker command line client - that app will be a future post when it's done.
In a twist of irony given my prior articles wrestling with a decent IDLE daemon for use with getmail, I'm faced with a new problem in figuring out the best way to migrate all my existing, locally hosted email to Gmail.
This is evidently not an uncommon problem for people, presumably for largely the same reasons I'm facing: although I like having everything locally on my own server, it only works in places where (1) I live in the same place as the server and (2) my server won't be double-NAT'd so dynamic DNS can actually reach it.
My personal email has been hosted on a Dovecot IMAP server in a Maildir up till now. Our tool of choice for this migration will be the venerable OfflineIMAP utility, available on Debian-ish systems with apt-get install offlineimap.
(2013-10-15) And like that I've broken it again. Fixing the crash on IMAP disconnect actually broke IMAP disconnect handling. The problem here is that IMAPClient's exceptions are not documented at all, so a time-based thing like IDLE requires some guessing as to what IMAPClient will handle and what you need to handle. This would all be fine if there was a way to get Gmail to boot my client after 30 seconds so I could test it easily.
I've amended the code so that any time the code would call _imaplogin() it explicitly dumps the IMAPClient object after trying to log it out, and recreates it. Near as I can tell this seems to be the safe way to do it, since the IMAPClient object does open a socket connection when created, and doesn't necessarily re-open if you simply re-issue the login command.
There's an ongoing lesson here that doing anything that needs to stay up with a protocol like IMAP is an incredible pain.
(2013-10-14) So after 4 days of continuous usage I'm happy with this script. The most important thing it does is crash properly when it encounters a bug. I've tweaked the Gist a few times in response (a typo meant imaplogin didn't recover gracefully) and added a call to notify_mail on exit which should've been there to start with.
It's also becoming abundantly clear that I'm way too click-happy with publishing things to this blog, so some type of interface to show my revisions is probably in the future (along with a style overhaul).
My previous attempt at a GetMail IDLE client was a huge disappointment, since imaplib2 seems to be buggy for handling long-running processes. It's possible some magic in hard terminating the IMAP session after each IDLE termination is necessary, but it raises the question of why the idle() function in the library doesn't immediately exit when this happens - to me that implies I could still end up with a zombie daemon that doesn't retrieve any mail.
Thus a new project - this time based on the Python imapclient library. imapclient uses imaplib behind the scenes, and seems to enjoy a little bit more use than imaplib2 so it seemed a good candidate.
Although the script in this article works, I'm having some problems with it after long-running sessions. The symptom seems to be that imaplib2 just stops processing IDLE session responses - it terminates and recreates them just fine, but no new mail is ever detected and thus getmail is never triggered. With 12 or so hours of usage out of the script, this seems odd as hell and probably like an imaplib2 bug.
With the amount of sunk time on this, I'm tempted to go in 1 of 2 directions: re-tool the script to simply invoke getmail's IDLE functionality, and basically remove imaplib2 from the equation, or to write my own functions to read IMAP and use the IDLE command.
Currently I'm going with option 3: turn on imaplib's debugging to max, and see if I can spot the bug - but at the moment I can't really recommend this particular approach to anyone since it's just not reliable enough - though it does somewhat highlight the fact that Python really doesn't have a good IMAP IDLE library.
I frequently find myself writing upstart scripts which check out OK, but for some reason don't get detected by the upstart daemon in the init directory, so when I run start myscript I get unknown job back. Some experimentation seems to indicate that the problem is that I used gedit over GVFS SFTP to author a lot of these scripts.
For something like myscript.conf, I find the following fixes this problem:
mv myscript.conf myscript.conf.d
mv myscript.conf.d myscript.conf
And then hey presto, the script works perfectly.
Along the same lines, the init-checkconf utility isn't mentioned enough for upstart debugging - my last post shows I clearly didn't know about it. Using it is simple:
$ init-checkconf /etc/init/myscript.conf
Note it needs to be run as a regular user. I'm often logged in as root, so sudo suffices:
$ sudo -u nobody init-checkconf /etc/init/myscript.conf
How to set up and use Wintersmith is covered pretty thoroughly elsewhere on the net (namely the Wintersmith homepage).
Instead I'll cover a few tweaks I had to do to get it running the way I wanted. To avoid being truly confusing, all the paths referenced here are relative to the site you create by running
wintersmith new <your site dir here>