Tales of an IT Nobody

devbox:~$ iptables -A OUTPUT -j DROP

Grepping extremely large files November 28, 2011

So you forgot to set up logrotate on an active log, eh? You’ve got a many-gigabyte file to weed through, and you need to extract a chunk of time from it?

Here’s a quick cheat sheet to help you get by, quickly and sanely.

It’s about byte offsets!

  • Get the byte offset in the file where your time range starts
  • Get the byte offset in the file where your time range ends
  • dd the data out!


  • You should tack on extra bytes to the byte count, because the offset_end number is actually the first byte of your boundary log entry
  • Figuring out the boundary is a bit tricky because a log entry -has- to be present in order to match; if you’re looking for what happened at 20:00 hours on X date, you may have to round up to the date level depending on how busy your log is
  • This is just a trick to extract a chunk of entries to speed up further filtering.

Full example
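A minimal sketch of the three steps above, assuming GNU grep (for the -b byte-offset flag); the timestamp patterns and file names are placeholders for whatever marks your range boundaries:

```shell
#!/bin/sh
# extract_range <log> <start-pattern> <end-pattern> <out>
# grep -b prefixes each match with its byte offset; -m1 stops at the first hit.
extract_range() {
  log=$1; out=$4
  start=$(grep -b -m1 -- "$2" "$log" | cut -d: -f1)  # first byte of the range
  end=$(grep -b -m1 -- "$3" "$log" | cut -d: -f1)    # first byte of the boundary entry
  # pad the count so the boundary entry itself isn't truncated
  dd if="$log" of="$out" bs=1 skip="$start" count=$(( end - start + 4096 )) 2>/dev/null
}
# e.g.: extract_range huge.log '28/Nov/2011:20:' '28/Nov/2011:21:' chunk.log
```

bs=1 keeps the offset arithmetic obvious but is slow on big extractions; GNU dd can take iflag=skip_bytes,count_bytes with a sane block size instead.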

No Comments on Grepping extremely large files
Categories: linux servers

Worthy of distribution: Cloud analogy November 5, 2011

This post on Beyond Bandwidth seems to summarize some of my feelings about cloud computing – for the most part, it’s best thought of as outsourcing. The benefits of something like an extra DNS server go a bit beyond an ‘outsource benefit’, but you get the idea:

Cloudy analogies with a chance of illusion

No Comments on Worthy of distribution: Cloud analogy
Categories: security servers

Is there a hacking campaign against open source? September 26, 2011

Linux.com, kernel.org, MySQL (twice this year), WordPress and PHP have all reported breaches of some sort this year. Is there some sort of campaign against these ‘high profile’ open source projects? It’s starting to feel like it to me.

The more hands you get in the pot, the more nervous you should get as an administrator. System issues stem from more than password change frequency and difficulty – stale keys, and access granted to folks who shouldn’t have it, happen all the time.

I also feel isolation, or ‘separation of concerns’, is a tactic that gets pushed aside in the name of maxing out a system; more often than not this stopgap would save a lot of trouble. Apache’s ability to contain last year’s breaches is a good example of isolation: they had a fairly sophisticated break-in, and the repercussions weren’t as loud as the ones from this year.

There doesn’t seem to be sufficient coverage of this MySQL hack right now – how sure are we this isn’t a sample set from a compromised browser as opposed to the site?

I hope there will be continued disclosure so everyone can learn something extra about safeguarding themselves.

While it doesn’t feel right to ream MySQL (at all, or at this point of the news) I have some initial thoughts I just can’t shake:

  1. If MySQL was ‘hacked’ – infiltrated earlier this year – you took no extra measures on a wider scope? Really?
  2. Why the hell is your web/any cluster accessible without a VPN? It sounds like they’re selling shell access directly to the host/s..

 C’est la vie

No Comments on Is there a hacking campaign against open source?
Categories: mysql security servers

The inherent risks of ‘daemonize’ features in developer tools – Git, Mercurial (hg) September 24, 2011

A handful of tools, such as Mercurial, Git, and soon PHP (which will likely ship as its own binary), have their own ‘daemonize’ functionality.

Whatever your reasons – if you want to disable these; there’s little to no help in figuring out how… til now…

If you want to disable Mercurial’s hg serve:
Open the file (your Python install path may differ, but this should give you an idea of what to search for):


Find the function ‘create-server’ and add ‘sys.exit()’ in the first line:
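For instance (hedged – in the Mercurial sources I’ve looked at, the function is spelled create_server in mercurial/hgweb/server.py, and sys needs importing; verify against your version before patching), the edit can even be applied with GNU sed:

```shell
# splice "import sys; sys.exit()" in as the first statement of create_server
kill_hg_serve() {  # usage: kill_hg_serve /path/to/mercurial/hgweb/server.py
  sed -i 's/def create_server(.*):/&\n    import sys; sys.exit()/' "$1"
}
# locate the file first, e.g.:
#   python -c 'import mercurial.hgweb.server as m; print(m.__file__)'
```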

How to verify this works:

1. Before patching – run ‘hg serve’ from a mercurial repository. It will report the port number and remain active in console.
2. After patching – ‘hg serve’ from a mercurial repository will simply exit and say nothing.
3. netstat, ps -A ux |grep ‘hg serve’

If you want to disable git’s git daemon:
This one is probably the easier of the two: ‘chmod a-x’ (remove execute permissions from) the ‘git-daemon’ binary on your system – mine is in /usr/libexec/git-core. You can also relocate it somewhere inaccessible.
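Sketched out (the find paths are just where distros commonly put it; the helper name is made up):

```shell
# strip all execute bits so 'git daemon' can't launch the listener
disable_git_daemon() {
  chmod a-x "$1" && echo "execute bits removed: $1"
}
# find yours first:  find /usr/libexec /usr/lib -name git-daemon 2>/dev/null
# then:              disable_git_daemon /usr/libexec/git-core/git-daemon
```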

How to verify this works:

1. Before relocating/removing/chmodding – run ‘git daemon’ – your console will remain active as if it’s listening. (You can try a base dir for a proper daemon setup if you want …)
2. After chmodding – running ‘git daemon’ gives an error about insufficient privileges; after relocating/removing, you’ll see “not a git command”.
3. netstat, ps -A ux |grep ‘git daemon’

No Comments on The inherent risks of ‘daemonize’ features in developer tools – Git, Mercurial (hg)
Categories: git hg linux security servers

System admin ‘helper’: Zebra stripe log / console output August 25, 2011

Looking at an ASCII data table can be difficult – so, to start a small trip into Perl programming, I tossed together a simple Perl script with no module requirements – zebra.pl, as I call it – and it zebra-stripes the output. It adds a nice touch to, say, vmstat, or viewing something like the interrupts on a multicore box. It’s super simple and done in the nature of Aspersa (now a part of Percona Toolkit).

It doesn’t work 100% like I want – I would have liked it to take an $ARGV; to do that it seems like I’d have to add a module dependency (something like Getopt::Long) – so I decided one can simply modify the script to change how many rows are striped at a time.
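zebra.pl itself is a Perl script; purely to illustrate the striping idea (not the author’s code), here’s a dependency-free awk take where the stripe height is an argument:

```shell
# stripe every other group of n lines with ANSI inverse video (n defaults to 1)
zebra() {
  awk -v n="${1:-1}" '{
    if (int((NR - 1) / n) % 2)
      printf "\033[7m%s\033[0m\n", $0   # striped row
    else
      print                             # plain row
  }'
}
# e.g.: vmstat 1 5 | zebra        or        zebra 2 < /proc/interrupts
```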

You can fetch it here.

No Comments on System admin ‘helper’: Zebra stripe log / console output
Categories: linux purdy servers

On coining terms: Kiloreq, Megareq, Gigareq, Terareq August 9, 2011

I’m inventing these terms. You heard them here first!

Ok so the idea goes like this:
We use kilo(bit/byte, etc) as measurements of rate, and size – even weight (kinda).

I thought it’d be fun to come up with another unit that’s right in line with the nature of these measurements, geared toward server load: “R” for request – prefixed accordingly: Kiloreq, Megareq, Gigareq, etc.

So for example, if you get 1,000 requests a second, you can say “I get 1KR/sec”; if you get 500 requests per second, instead of ‘500 rps’, use the standardized “KR” (Kiloreq) suffix: 0.5KR/sec.

How many requests did foo Apache server handle this month?
About 3MR – 3 megareqs. (3MR = 3,000KR = 3,000,000 requests.)

Or how about for the year? (Assuming a flat rate of 300MR per month over 12 months)
3.6GR. Gigareqs!
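The unit juggling, as a throwaway shell helper (the reqfmt name is invented; the math is the same as above):

```shell
# format a raw request count in the proposed units: KR, MR or GR
reqfmt() {
  awk -v r="$1" 'BEGIN {
    if      (r >= 1e9) printf "%.1fGR\n", r / 1e9
    else if (r >= 1e6) printf "%.1fMR\n", r / 1e6
    else               printf "%.1fKR\n", r / 1e3
  }'
}
# reqfmt 500        -> 0.5KR
# reqfmt 3000000    -> 3.0MR
# reqfmt 3600000000 -> 3.6GR
```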

Far fetched? Yes. But I plan on using them in my day-to-day language to try and make them stick :)

No Comments on On coining terms: Kiloreq, Megareq, Gigareq, Terareq
Categories: servers

Netflix is run by monkeys! August 4, 2011

An entertaining read on the HA operations at Netflix – a good sense of humor and a very cool, hardcore philosophy for testing!


It’s nice to see Netflix stepping up their involvement in the technical community even more; with the Netflix prize and their blog and API feedback – I hope they become even more successful because of these investments.

No Comments on Netflix is run by monkeys!
Categories: netflix servers

MySQL – max_allowed_packet – what is going on?

So there’s enough noise in the MySQL community about what’s covered well here (https://www.facebook.com/note.php?note_id=10150236650815933)

Unfortunately the bug is private for the time being; in my conversations with others, the general premise seems to be: what good does max_allowed_packet really do?

First off, I’d like to point out something I hope is heading for deprecation – otherwise it just feels a bit sloppy: the default max_allowed_packet for the MySQL client is 1GB (AKA: the maximum).

As the FB post recognizes, there’s some ambiguity in how this setting is even enforced in the first place, especially in a master->slave configuration. (Why does replication even have to follow that rule? Maybe replication clients could hard-code the packet size to the maximum to get past this?)

I’d propose one of these two:

1. Enforce max_allowed_packet at the server – negotiate a loose communication with the client, where the client obtains the server’s value and takes it for its own.

2. Better yet, allow it to be set on a per user basis, following #1.

No Comments on MySQL – max_allowed_packet – what is going on?
Categories: mysql servers

Nay say for ext2/ext3, seemingly ext4 for MySQL servers July 19, 2011

Basically I felt compelled to make a note regarding what filesystem to evaluate when you are performing a MySQL install. There seem to be a lot of reasons NOT to use the ext filesystems, and to use XFS instead.

This is a straight out quote from a MySQL at Facebook blog entry:

ext-2 and ext-3 lock a per-inode mutex for the duration of a write. This means that ext-2 and ext-3 do not allow concurrent writes to a file and that can prevent you from getting the write throughput you expect when you stripe a file over several disks with RAID. XFS does not do this which is one of the reasons many people prefer XFS for InnoDB.

More on I/O concurrency from MySQL big name Domas Mituzas:

  • O_DIRECT serializes writes to a file on ext2, ext3, jfs, so I got at most 200-250w/s.
  • xfs allows parallel (and out-of-order, if that matters) DIO, so I got 1500-2700w/s (depending on file size – seek time changes.. :) of random I/O without write-behind caching. There are few outstanding bugs that lock this down back to 250w/s

 A patch for ext4 was created, but it doesn’t appear that it made it in; it seems to yield minimal benefit.

And there are some other performance and risk observations involving the most widely used of them, ext3.

If you’re looking to install or upgrade a MySQL server, it may very well be worth the time investment to research your filesystem selection in depth, since it has just as much to do with database performance as the MySQL configuration itself!

No Comments on Nay say for ext2/ext3, seemingly ext4 for MySQL servers
Categories: databases mysql servers

Worthy of distribution: Reset root MySQL password July 18, 2011

Oh snap! Need to reset your root/admin (or any) MySQL password? Well, you’ll need root and control over mysqld to some extent, but this is worthy of a rainy-day bookmark indeed: http://mysqlpreacher.com/wordpress/2011/03/recovering-a-mysql-root-password-three-solutions/

No Comments on Worthy of distribution: Reset root MySQL password

/usr/bin/chage – Sending emails when a password expires, or is about to June 6, 2011

There are a lot of scripts out there that do this, but they either don’t revolve around /etc/shadow enough or they’re sloppy.

Here’s my spin on a script for nightly cron that will parse /etc/shadow and send out emails based on the per-user values. It’s resistant to garbage dates (99999 ‘expiration’ dates).

Below is my best attempt at making the script ‘cohesive’ in this layout; however, you can find the script here as well.
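The real script is at the link; the core loop amounts to something like this sketch (field 3 of /etc/shadow is the last-change day, field 5 the max age, both counted in days; the function name is made up):

```shell
# print users whose passwords expire within N days, from a shadow-format file;
# pipe the output to mail(1) from a nightly cron
expiry_warnings() {  # usage: expiry_warnings /etc/shadow [warn_days]
  shadow=$1; warn=${2:-7}
  today=$(( $(date +%s) / 86400 ))                  # today, in shadow day units
  while IFS=: read -r user pass last min max rest; do
    case $last in ''|*[!0-9]*) continue ;; esac     # no usable last-change date
    case $max  in ''|*[!0-9]*) continue ;; esac     # no max age set
    [ "$max" -ge 99999 ] && continue                # garbage 'never expires' value
    left=$(( last + max - today ))
    [ "$left" -le "$warn" ] && echo "$user: password expires in $left day(s)"
  done < "$shadow"
}
# cron sketch: expiry_warnings /etc/shadow 7 | mail -s 'expiring passwords' root
```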

No Comments on /usr/bin/chage – Sending emails when a password expires, or is about to
Categories: linux security servers

MySQL 5.5.12 – init script warning May 25, 2011

I’ve just reported a bug regarding the init script that comes in MySQL 5.5’s source distribution.

Basically, if you call the ‘start’ clause of the script twice, it will hose the service by letting multiple instances run and fight over the same resources (pid file, socket and TCP port) – naturally this makes the service that -was- working fine screech to a halt, and mysqladmin shutdown won’t work.. The only way to fix this is to do something like this to get things back to normal:

My solution to avoid this for the time being is to put this in the beginning of the ‘start’ case clause in the ‘mysql.server’ script that we’re copying to /etc/init.d:
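A hedged reconstruction of that guard (the $server_pid_file name is an assumption – reuse whichever pid-file variable mysql.server itself defines):

```shell
# true if a live process already holds the pid file
mysqld_already_running() {
  [ -s "$1" ] && kill -0 "$(cat "$1")" 2>/dev/null
}
# at the top of the 'start' case clause:
#   if mysqld_already_running "$server_pid_file"; then
#     echo "mysqld appears to be running already"
#     exit 0
#   fi
```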

I chose exit 0; because technically, it’s still a successful start.

No Comments on MySQL 5.5.12 – init script warning
Categories: linux mysql servers

Amazon AWS – The risk of using a cooked AMI May 11, 2011

Straight from the horse’s mouth; I no longer use this AMI – the only ones I’ve used are Debian EBS and SLES … Fortunately I already went through authorized_keys on the one I do keep around.

People take AWS services seriously – but the AMI sharing always set off a flag for me. “Community AMI?” – No thanks! (Unfortunately it’s the only choice for people who don’t want to – or do not have the time to – make their own AMI they can trust.)

Dear AWS Customer,

We are aware that a public Amazon Machine Image (AMI) in the Amazon EC2 US East (Virginia) region includes a public SSH key that could allow the AMI publisher to log in as root. Our records indicate that you have launched instances of this AMI.

AWS Account ID:  [REMOVED]


Instance ID(s)

We are taking steps to remove the affected AMI within the next 24 hours. This will prevent launching new instances of the affected AMI, though existing instances of this AMI will continue to function normally.  For existing instances you may have of this AMI, we recommend that you migrate services to new instances based on a different AMI.

While you are migrating your services to a new instance, we also recommend that you identify and disable unauthorized public SSH keys. To do so, you will need to remove any unrecognized keys from your running instance(s). Note that public SSH keys are not guaranteed to be in the ‘/root/.ssh/authorized_keys’ file. The following command will locate all of the “authorized_keys” files on disk, when run as root:
       find / -name "authorized_keys" -print -exec cat {} \;

This command will generate a list of all known “authorized_keys” files, which you can then individually edit to remove any unrecognized keys from each of the identified files. To ensure that you do not inadvertently remove your authorized keys, we recommend that you initiate two SSH sessions when starting this process for each instance. You should keep the second session open until you have confirmed that all unrecognized / unauthorized keys are removed and that you still have SSH login access to the instance using your authorized key.

If you do not use SSH to connect to your Amazon EC2 instances, we recommend that you check the security groups associated with the above instance(s) to ensure that port 22 inbound is closed to all unknown IPs. This can be done via the AWS Management Console. For detailed instructions, please check the “Using Security Groups” section of the Amazon EC2 User guide:


We hope this information is helpful.

Best regards,

Amazon Web Services Support

This message was produced and distributed by Amazon Web Services LLC, 410 Terry Avenue North, Seattle, Washington 98109-5210

3 Comments on Amazon AWS – The risk of using a cooked AMI
Categories: security servers

Cacti – DNS response time February 25, 2011

When you google for a Cacti template for DNS response time, there’s not a whole lot out there, and what is out there is pretty outdated or involves too much fidgetry.

This post assumes you’re comfortable with Cacti – you should be able to at least initialize a graph and fill one in using datasources for a host. You must also be using Linux; BSD has a different pecking order of commands.

This guide shows you how to slap together a quick DNS response data input method that will let you set up graphs at nameserver/domain-pair granularity. (Meaning you can graph the same domain across several nameservers, or vice versa.)

So here’s a quick rundown on creating a “data input method” and a “data template” for cacti to utilize for your nameservers.

1. Create a new data input method
   Name: (anything you want)
   Input type: Script/Command
   Input string:
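A hedged example input string using dig (<ns> and <dom> are the Cacti input-field placeholders from step 1b, and ResponseTime matches the output field in 1c):

```
dig @<ns> <dom> | grep 'Query time' | awk '{print "ResponseTime:" $4}'
```

dig prints a line like “;; Query time: 23 msec”, so $4 is the millisecond count.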

The 1-liner above should get you the msec for a given domain (dom) at a given nameserver (ns). To test it completely, replace dom and ns with something valid:

   1b. Add the two ‘input fields’, ns and dom.
   1c. Add “ResponseTime” as an ‘output field’.

If done correctly, it should look similar to this:

2. Create the data template – Fill out the values to look similar to the screenshot below. Note, you will probably have to hit ‘create’ after selecting the data input method under “data source”. This will detect the “output field” for the “Data Source Item” values.

Here’s what one of mine looks like:

 I’ve omitted the target host/ns from this example image of course :)

6 Comments on Cacti – DNS response time
Categories: cacti linux servers

On: ntp, ntpd. link dump! January 14, 2011

So, in order to quickly have a (Debian) machine up and running on NTP, you’re bound to do something like ‘apt-get install ntp ntpdate’.

The problem is that this installs ‘ntpd’ too, and the default configuration allows your server to answer NTP queries from anywhere.

If you want to crack down, you’ll be somewhat frustrated with pre-4.6 config options, as they’re nontraditional compared to what we usually see; without further ado, here’s a simple ‘link dump’ for a configuration guide.

On ntp 4.x? Guess what? Doesn’t work =[ – must be done with iptables.
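For 4.x, then, a hedged iptables equivalent of the restrict lines below (run as root; substitute your actual upstream for my.server.address):

```
# accept NTP packets only from the server we poll; drop everyone else's
iptables -A INPUT -p udp --sport 123 -s my.server.address -j ACCEPT
iptables -A INPUT -p udp --dport 123 -j DROP
```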

Here’s the cheatsheet /etc/ntp.conf :

driftfile /var/lib/ntp/ntp.drift
server my.server.address

restrict default ignore
restrict -6 default ignore


restrict my.server.address

This will allow you to poll things (e.g. ntpq -p) and keep everyone else from sending packets to your box, either on purpose or by accident. Note: you -have- to have your ‘servers’ in restrict lines, or else it’ll hang on the first poll (indicated by ntpq -p).

When ntp isn’t working right, this is what ntpq -p looks like:

 box:/etc# ntpq -p
      remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
  my.server.address    .INIT.     16 -    -   64    0    0.000    0.000   0.000

Note the 0.000’s in the delay/offset/jitter – it’s also stuck on the sync request at INIT.

A properly functioning ntpq -p should look something like this:

box:/etc# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 my.server.address                 3 u    3   64    1    1.349  2446.01   0.000

No Comments on On: ntp, ntpd. link dump!

Time to be informed! March 2, 2010

What would you do if you received a legitimate looking email from your hosting company asking you to OPEN an SMTP relay?

That’s apparently a new style of spam (to create more spam!) targeting administrators. I’m sure there’s a handful of ‘admins’ who would be more than happy to oblige with their skillz and open a relay, without really thinking about how fricken nuts that sounds …


Time to separate the weak from the weaker, or is it more weak?

No Comments on Time to be informed!
Categories: security servers

Cloud Computing and the 3rd Reich

A co-worker suggested this would be a good one to put up; I agree – and this clip never gets old!

No Comments on Cloud Computing and the 3rd Reich
Categories: rant security servers

Very in-depth explanation of *nix filesystem June 12, 2009

Came across this while reading about build integration for development and thought I’d make a note about it. It’s much more than just a ‘user files go in /home’ cheat sheet – it’s everything you could imagine regarding why *nix systems are laid out like they are. Link:

No Comments on Very in-depth explanation of *nix filesystem
Categories: linux servers

Tuning apache directory indexes June 11, 2009

Are you a fan of Options +Indexes like I am?

There are a few tweaks you can apply to this feature to make it behave more like you want.
Throw a gander at the IndexOptions directive documentation for fine-details.

You can place these in a site configuration, or if allowed, in a .htaccess file.

Notable options:

Make sure you pay attention when using the FancyIndexing option, since it resets options that come before it, e.g.:
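For example (hedged – the keywords are real mod_autoindex options, but check the IndexOptions docs for your Apache version’s exact merge behavior):

```
# A bare (unprefixed) keyword like FancyIndexing wipes incremental (+/-)
# keywords merged in before it, so keep the prefixes consistent:
IndexOptions FancyIndexing SuppressDescription NameWidth=*
# or, when layering onto inherited options:
IndexOptions +FancyIndexing +SuppressDescription
```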

No Comments on Tuning apache directory indexes
Categories: apache servers

vnstat/vnstati for quick mrtg style bandwidth graphs June 2, 2009

Ever want the nifty bandwidth charts that MRTG produces, but without the overhead of learning and manipulating the RRD (Round Robin Database) stuff?

vnstat has a companion tool, vnstati, that produces some very sexy snaps of interface bandwidth.

Compile and install from source and play around with the options –

For our dev box at work, I’m running 5- and 10-minute crons for the summary, daily and monthly views:
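A hedged crontab sketch (interface and output paths are invented; the -s/-d/-m summary/daily/monthly flags are real vnstati options):

```
*/5 * * * *   vnstati -s -i eth0 -o /var/www/stats/summary.png
*/10 * * * *  vnstati -d -i eth0 -o /var/www/stats/daily.png
*/10 * * * *  vnstati -m -i eth0 -o /var/www/stats/monthly.png
```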

Output – Can you tell what time our backup is copied? =]

No Comments on vnstat/vnstati for quick mrtg style bandwidth graphs
Categories: linux servers