Linux Tip: super-fast network file copy

Posted on November 14, 2008 by Jason Striegel.
Categories: Contributors.

If you've ever had to move a huge directory containing many files from one server to another, you may have encountered a situation where the copy rate was significantly less than what you'd expect your network could support. Rsync does a fantastic job of quickly syncing two relatively similar directory structures, but the initial clone can take quite a while, especially as the file count increases.

The problem is that there is a certain amount of per-file overhead when using scp or rsync to copy files from one machine to the other. This is not a problem under most circumstances, but if you are attempting to duplicate tens of thousands of files (think server or database backups), this per-file overhead can really add up. The solution is to copy the files over in a single stream, which normally means tarring them up on one server, copying the tarball, then untarring on the destination. The tarball needs room next to the original data, though, so unless you are under 50% disk utilization on the source server, this could cause you to run out of space.
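As a rough way to check the disk-space concern before building a tarball, here is a minimal sketch that compares the size of a directory with the free space on its filesystem. The directory argument and the variable names are assumptions for illustration, and the estimate is a worst case, since compression usually makes the tarball smaller than the raw data.

```shell
#!/bin/sh
# Worst-case check: would an uncompressed tarball of a directory fit
# on the same filesystem? (Hypothetical helper, not from the original tip.)
set -eu

src=${1:-.}   # assumption: the directory you intend to tar up

# Size of the directory tree, in KiB.
need_kb=$(du -sk "$src" | awk '{print $1}')

# Free space on the filesystem holding it, in KiB.
# -P forces the one-line-per-filesystem POSIX output format.
avail_kb=$(df -Pk "$src" | awk 'NR==2 {print $4}')

if [ "$need_kb" -lt "$avail_kb" ]; then
    verdict="fits"
else
    verdict="too big"
fi

echo "need ${need_kb} KiB, have ${avail_kb} KiB free: ${verdict}"
```

If the verdict is "too big", a streaming copy is the way to go, since it never writes the archive to disk.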

Brett Jones has an alternative solution, which uses the handy netcat utility:

After clearing up 10 GBs of log files, we were left with hundreds of thousands of small files that were going to slow us down. We couldn't tarball the files because of a lack of space on the source server. I started searching around and found this nifty tip that takes out encryption and streams all the files as one large file:


This requires netcat on both servers.

Destination box: nc -l -p 2342 | tar -C /target/dir -xzf -
Source box: tar -czf - /source/dir | nc Target_Box 2342

This causes the source machine to tar the files up and send them over the netcat pipe, where they are extracted on the destination machine, all with no per-file negotiation or unnecessary disk space used. It's also faster than the usual scp or rsync over ssh because there is no encryption overhead. If you are on a local protected network, this will perform much better, even for large single-file copies.
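To see the single-stream idea in isolation, here is a minimal sketch that runs both halves of the pipe on one machine, with a plain pipe standing in for netcat. The temporary directories and the file count are assumptions made up for the demo.

```shell
#!/bin/sh
# Local stand-in for the network pipe: one tar writes a single stream to
# stdout, a second tar reads it from stdin. No tarball touches the disk.
set -eu

src=$(mktemp -d)   # hypothetical source directory
dst=$(mktemp -d)   # hypothetical destination directory

# Create a few hundred small files to copy.
i=1
while [ "$i" -le 300 ]; do
    echo "file $i" > "$src/file_$i.txt"
    i=$((i + 1))
done

# Pack on the "source" side, unpack on the "destination" side.
# Over a network, nc (or ssh) would sit in the middle of this pipe.
tar -czf - -C "$src" . | tar -xzf - -C "$dst"

copied=$(find "$dst" -type f | wc -l | tr -d ' ')
echo "copied $copied files"
```

The explicit "-f -" tells tar to use stdout and stdin, rather than relying on the build's default archive device.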

If you are on an unprotected network, however, you may still want your data encrypted in transit. You can perform about the same task over ssh:

Run this on the destination machine:
cd /path/to/extract/to/
ssh user@source.server 'tar -czf - -C /source/path/ .' | tar -xzvf -

This command will issue the tar command across the network on the source machine, causing tar's stdout to be sent back over the network. This is then piped to stdin on the destination machine and the files magically appear in the directory you are currently in.
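The pull pattern described above can be tried out without a second machine. In this sketch, REMOTE is a hypothetical variable: left unset, it runs the "remote" half through a local shell, and set to something like "ssh user@source.server" it would run it over ssh (with a real source path in place of the temporary directory). It passes "." to tar, which tar resolves after changing into the directory given by -C, so dotfiles are included as well.

```shell
#!/bin/sh
# Pull a directory tree over a single tar stream.
# REMOTE is a hypothetical knob for this demo: "sh -c" (the default) runs
# the packing command locally; "ssh user@host" would run it remotely.
set -eu

remote=${REMOTE:-sh -c}
src=$(mktemp -d)    # stands in for /source/path/ on the source machine
dst=$(mktemp -d)    # stands in for the directory you extract into

mkdir -p "$src/sub"
echo "hello" > "$src/a.txt"
echo "world" > "$src/sub/b.txt"
echo "dot"   > "$src/.hidden"

# The quoted command is what the source side executes; its stdout is the
# tar stream, which the local tar unpacks into $dst.
$remote "tar -czf - -C '$src' ." | tar -xzf - -C "$dst"

ls -A "$dst"
```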

The ssh route is a little slower than using netcat, due to the encryption overhead, but it's still way faster than scping the files individually. It also has the added advantage of potentially being compatible with Windows servers, provided you have a few of the Unix tools like ssh and tar installed on your Windows server (using the Cygwin-linked binaries that are available).

Fast File Copy - Linux!

SlugPower - Linux controlled power switch

Posted on November 10, 2008 by Jason Striegel.
Categories: Contributors.


Phil Endecott has done a bit of hacking with the Linksys NSLU2 "Slug", the low-power network storage device which runs Linux under the hood. His SlugPower project is a switched outlet that can be controlled from the Slug. This enables his print server to power up the printer when it needs to be printing, and automatically cut power to the device when it's not in use.

This page describes the hardware and software design of a printer power switch controlled over USB from my Linksys NSLU2, aka Slug. The unit can, however, be controlled from any Linux box, and can switch anything, not just printers.

My NSLU2 acts mostly as a file and print server. I can go for weeks without printing anything, so I want to keep the printer switched off when I'm not using it (it takes about 4W while idle, which must be more than 99% of its total energy consumption). But it's upstairs, and I don't want to have to go up and down stairs once to switch it on and again to collect my printing. So I decided to get a power switch.

Remote power switches are pretty common in server rooms, but they are costly. This is a pretty affordable way to control the power to any device from anywhere in the world.

SlugPower - A Slug-Controlled Power Switch
Phil Endecott's Slug Projects
NSLU2-Linux

CSSHttpRequest - cross browser AJAX without JSON

Posted on November 2, 2008 by Jason Striegel.
Categories: Contributors, Dynamic HTML.

Because XMLHttpRequest only functions in a same-origin model, the main alternatives have been to either proxy the XML request server-side, or transfer JavaScript arrays via JSON (since cross-domain script calls are allowed). CSSHttpRequest is another method for performing cross-domain AJAX-style requests, but instead of loading a remote JavaScript file, CSS is used as the transport, and data is encoded inside of URLs in @import statements.

A request is invoked using the CSSHttpRequest.get(url, callback) function:
  CSSHttpRequest.get(
      "http://www.nb.io/hacks/csshttprequest/hello-world/",
      function(response) { alert(response); }
  );

Data is encoded on the server into URI-encoded 2KB chunks and serialized into CSS @import rules with a modified about: URI scheme. The response is decoded and returned to the callback function as a string:

  @import url(about:chr:Hello%20World!);

CSSHttpRequest is open source under an Apache License (Version 2.0).

This is a pretty cool alternative. It seems to be a much safer way to do things than blindly executing JavaScript from servers not under your control. It's somewhat like what XMLHttpRequest could have offered if it weren't limited by the same-origin policy (though in a more roundabout way).

It still raises the question: why on earth is XMLHttpRequest limited by a same-origin policy, especially when it forces developers to adopt more dangerous methods for cross-domain communication?

CSSHttpRequest

Linuxstamp embedded Linux system

Posted on October 15, 2008 by Jason Striegel.
Categories: Contributors.


If the Beagle Board caught your eye, here's another embedded Linux platform that's worth taking a peek at. The Linuxstamp is an ARM powered, ultra-tiny, open hardware Linux system that has a bunch of low-power goodies packed into what appears to be a 3 inch by 4.5 inch footprint.

Compared to the Beagle Board, the Linuxstamp has a bit less processor muscle and lacks video output. To its advantage, it has on-board 10/100 Ethernet, and (I presume) it has lower power requirements, making it a better fit for some embedded needs. Both projects are near the same price point (Linuxstamp: $120, Beagle Board: $150), so you'll be able to make decisions mostly on feature-set when choosing the platform for your next project.

Linuxstamp Project Wiki at Open Circuits [via ladyada]
The Linuxstamp Store

All AJAX image editor

Posted on September 10, 2008 by Jason Striegel.
Categories: Contributors.


Nich sent us a link to his project, Dr. Pic, an all AJAX image editor. Without using any Flash, the application allows you to upload an image, do simple draw and filter operations, place text, crop, resize, and save a finished copy. JavaScript is used to draw preview material to the canvas, and then the user's commands are submitted back to a PHP backend which returns a new image to replace the previous version.

It doesn't pretend to be a Photoshop, but in a pinch it could come in handy as a quick tool for resizing or cropping an image. Aside from all that, it's a nice example of how you can leverage some server-side heavy lifting to support functionality that JavaScript is lacking.

Dr. Pic - AJAX image editor