The future is already here — it's just not very evenly distributed.
February 12, 2013
I was working on some log processing the other day when I found myself wanting the Python equivalent of Unix’s head and tail commands.
The problem is that almost every Python implementation of tail you can find shares the same flaws:
- Read the entire file into memory
- Iterate over the entire file
Obviously, this can be quite problematic when you are constantly processing 300 MB logs. Here is a more efficient version of tail:
    import os


    def tail(filename, count=1, offset=1024):
        """
        A more efficient way of getting the last few lines of a file.
        Depending on the length of your lines, you will want to modify offset
        to get better performance.
        """
        f_size = os.stat(filename).st_size
        if f_size == 0:
            return []
        with open(filename, 'r') as f:
            if f_size <= offset:
                # Small file: start from the middle (at least one byte back)
                offset = max(f_size // 2, 1)
            while True:
                # Seek back `offset` bytes from the end, never before the start
                seek_to = max(f_size - offset, 0)
                f.seek(seek_to)
                lines = f.readlines()
                # Empty file
                if seek_to <= 0 and len(lines) == 0:
                    return []
                # count is at least the number of lines in the file
                if seek_to == 0 and len(lines) <= count:
                    return lines
                # Standard case: one extra line, since the first may be partial
                if len(lines) >= (count + 1):
                    return lines[count * -1:]
                # Not enough lines yet: look further back and try again
                offset *= 2


    def head(filename, count=1):
        """
        This one is fairly trivial to implement but it is here for completeness.
        """
        with open(filename, 'r') as f:
            lines = [f.readline() for _ in range(count)]
            # readline() returns '' at end of file, so drop the empty entries
            return [line for line in lines if line]
This is of course available as a gist as well.
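As a variation on the same seek-from-the-end idea, here is a rough sketch that opens the file in binary mode, so that seeking to an arbitrary byte offset is well defined on Python 3, and doubles the window until enough lines are found. The name `tail_bytes`, the `block_size` parameter, and the UTF-8 default are assumptions made for illustration; they are not part of the gist.

```python
import os


def tail_bytes(filename, count=1, block_size=1024, encoding="utf-8"):
    """Return the last `count` lines of `filename` as a list of strings.

    Hypothetical helper: reads a growing window from the end of the file
    instead of loading or iterating over the whole thing.
    """
    if count <= 0:
        return []
    f_size = os.stat(filename).st_size
    if f_size == 0:
        return []
    # Guard against a zero or negative block size, which would never grow
    window = max(block_size, 1)
    with open(filename, "rb") as f:
        while True:
            # Never seek before the start of the file
            seek_to = max(f_size - window, 0)
            f.seek(seek_to)
            lines = f.read().splitlines(True)  # True keeps the line endings
            # Either we started at byte 0 (every line is complete), or we have
            # more than `count` lines, so the possibly partial first line in
            # the window is not among the ones returned.
            if seek_to == 0 or len(lines) > count:
                return [line.decode(encoding) for line in lines[-count:]]
            # Not enough lines in this window: double it and retry
            window *= 2
```

For example, `tail_bytes("app.log", count=20)` would return the last 20 lines of a file named `app.log` (a made-up name). Decoding only the returned lines avoids tripping over a multi-byte character that was cut in half at the start of the window.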
Tags: code, gist, github, python
Posted in Code, Gist, Python
• • • • •
June 26, 2012
If you have ever used tail to follow multiple files, you will have noticed that it can be a real pain to tell which data is coming from which file. Here is some quick Perl I use to make it a bit easier:
    tail -f *error_log | perl -pe "s/==>(.*)?<==/\e[0;31m$&\e[0m/g";
You can of course extend this into a full-blown Perl script that processes the data as follows:
    #!/usr/bin/perl
    # Filename: processor.pl
    while ($line = <STDIN>) {
        # Process $line however you need
        # For the above example...
        $line =~ s/==>(.*)?<==/\e[0;31m$&\e[0m/g;
        # Now print it out like a boss
        print $line;
    }
Now you can just pipe whatever data you want into the above script:
    tail -f *error_log | perl processor.pl
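The same filter can be sketched in Python, for anyone who would rather not reach for Perl. This is only an illustration: the names `highlight` and `colorize_stream` are made up here, and the pattern assumes tail's usual `==> filename <==` header format.

```python
import re
import sys

# tail prints a "==> filename <==" header whenever the output switches files
HEADER = re.compile(r"==>.*<==")
RED = "\033[0;31m"
RESET = "\033[0m"


def highlight(line):
    """Wrap any "==> ... <==" header in ANSI red; other text is unchanged."""
    return HEADER.sub(lambda match: RED + match.group(0) + RESET, line)


def colorize_stream(source, sink):
    """Copy lines from `source` to `sink`, highlighting tail's file headers."""
    for line in source:
        sink.write(highlight(line))
        # Flush after every line so the output keeps up with `tail -f`
        sink.flush()
```

Saved as, say, `highlight.py` (a hypothetical filename) with a final line of `colorize_stream(sys.stdin, sys.stdout)`, it could be used the same way: `tail -f *error_log | python highlight.py`.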
Tags: linux
Posted in Linux
• • • • •
June 3, 2012
At my work, we have been using a Git branching strategy based on Vincent Driessen’s successful Git branching model. Overall, the strategy that Vincent proposes is very good and may work perfectly out of the box in many cases. However, since we started using it, I have noticed a few problems:
- A LOT of branches - Since Vincent’s model has a branch for each release version, over time you get a lot of branches lying around.
- Redundant - At any given time, we will have 3 identical entities:
  - A release branch
  - A master branch (after release-1.0 has been merged in)
  - A version tag (1.0)
- Changes to every build - We use Jenkins CI and we currently have to change each job to point to the new tag or branch. It gets a little tedious and won’t scale in the long run.
Fortunately, these problems are easy to fix. In fact, the fixes simplify the model and allow simpler automation.
Read the rest of this entry »
Tags: CI, git, process
Posted in Git
• • • • •
June 3, 2012

When using Linux every day, it is important to find a distribution that works for you. For me, after more virtual machines than I care to count, I settled on Fedora as my distribution of choice. I really like the constant releases and the latest software. It’s also nice to be able to jump on a RHEL machine and feel at home!
So, I get pretty psyched every 6 or so months when a new version of Fedora is released. The latest release is one of the bigger ones they have done and, while I am still playing around with it, it seems full of great updates.
Since I use it as my development environment, the nicest change for me day-to-day is the update to GNOME 3.4, a desktop environment that I seem to enjoy much more than most developers do.
If you have some time, grab the latest Fedora release and check it out! You won’t be disappointed.
Tags: fedora, linux, release
Posted in Linux
• • • • •
May 12, 2012
So it seems that, while I don’t have a lot of code that would work as projects, I do have a fair bit of code that works as gists.
I posted a new gist, which I simply call notify, that lets you show custom messages in the message tray programmatically. I often use it in conjunction with kTimer. I have only tested this script with GNOME 2.x through 3.2, but it may well work in KDE or other desktop environments.
If you find notify useful, please leave a comment!
Tags: code, gist, github, gnome, linux
Posted in Code
• • • • •
March 27, 2012
Like most developers, I use the terminal… a lot.
Because I like colours and am in the terminal more often than not, I have developed something of an obsession with colourizing everything I can. I have created two scripts that I find myself using fairly consistently. They are alternative, ANSI-colour-friendly implementations of watch and cat. I have created a gist on GitHub for each of them.
I hope you find them as helpful as I do!
Tags: code, gist, github, linux
Posted in Code, Linux
• • • • •