
Spike Morelli has over a decade of experience as an engineer and is now a devops consultant and proud startup owner. After years focused on technical challenges like automation, monitoring, scalability and cloud, Spike took an unexpected turn and, while still in engineering, started working with people rather than machines, coaching engineers and helping teams go from good to great. Spike is a DZone MVB, is not an employee of DZone, and has posted 10 posts at DZone.

Transferring Large Amounts of Data Over the Network: scp, tar over ssh, and tar over nc Compared

05.23.2012

Scp is slow; that's a known fact, and annoying enough that someone tried to fix it by producing the hpn-ssh patch:

SCP and the underlying SSH2 protocol implementation in OpenSSH is network performance limited by statically defined internal flow control buffers. These buffers often end up acting as a bottleneck for network throughput of SCP, especially on long and high bandwidth network links.


Nonetheless, especially for small transfers, scp is straightforward, so that's what I use. But transferring 100GB of data between two machines on the same LAN proved to be such a pain that I opted for one of the alternatives, the two most common being tar over ssh and tar over netcat. The whole thing got me curious, so I decided to do some testing and benchmarking.

This is not a scientific test. There was background noise, the boxes ran different OSes, and more. But it's good enough as a real-life test between two boxes on the same LAN.

Test bed

Two boxes, referred to as hostA and hostB from now on, with the same specs:

vendor_id : AuthenticAMD
model name : AMD Sempron(tm) Processor 2800+
cpu MHz : 1600.010
MemTotal : 2009992 kB
SATA disks:  Timing cached reads: 1243.04 MB/sec
Timing buffered disk reads: 57.97 MB/sec
Network : VIA Technologies, Inc. VT6102 [Rhine-II] (rev 78)
Switch : Netgear 10/100 Mb/s


The boxes were connected via a 10/100 Mb/s switch and lived on the same LAN/subnet. Given the above setup, it's safe to assume that the network is the bottleneck, with a theoretical peak transfer rate of about 12 MB/s (100 Mb/s is 12.5 MB/s before protocol overhead).

Test cases and data set

I created two directories, one containing 2000 files of 100KB each and the other 200 files of 10MB each. All files were created using dd if=/dev/urandom of=file. These are the commands I compared:

hostA: scp -r dir user@hostB:/tmp/
hostA: tar cf - dir | ssh user@hostB tar xf - -C /tmp/
hostA: tar cf - dir | nc -w1 hostB 6969
hostB: nc -l -p 6969 | tar xf - -C /tmp/
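For reference, the two test directories described above could be generated with a loop like the following. This is only a sketch: the function name make_fileset, the file naming, and the bs/count values passed to dd are assumptions, since the article only gives dd if=/dev/urandom of=file.

```shell
#!/bin/sh
# make_fileset DIR COUNT KIB (hypothetical helper, not from the article):
# create COUNT files of KIB kibibytes each, filled with random data.
make_fileset() {
    dir=$1; count=$2; kib=$3
    mkdir -p "$dir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        # bs and count are assumed; the article does not show them
        dd if=/dev/urandom of="$dir/file_$i" bs=1024 count="$kib" 2>/dev/null || return 1
        i=$((i + 1))
    done
}

# Example calls (commented out because they write about 2.2GB to disk):
# make_fileset small 2000 100      # 2000 files of 100KB
# make_fileset large 200 10240     # 200 files of 10MB
```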


I also ran a set of tests using ssh compression and tar gzip compression. Note that bzip2 compression is too CPU-expensive to be generally worth it.
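The exact command lines for the compressed runs are not shown, so the following is a guess at what they may have looked like, wrapped in functions so nothing runs until called. The function names are hypothetical; user@hostB, port 6969 and /tmp/ are taken from the commands above. tar_gz_local is an extra helper, not part of the benchmark, for trying the gzip pipe on one machine without a network.

```shell
#!/bin/sh
# Assumed forms of the compressed variants; not confirmed by the article.

# scp with ssh-level compression (-C)
scp_compressed()     { scp -C -r "$1" user@hostB:/tmp/; }

# tar over ssh with ssh-level compression (-C)
tar_ssh_compressed() { tar cf - "$1" | ssh -C user@hostB tar xf - -C /tmp/; }

# tar over ssh with gzip done by tar (z) on both ends
tar_gz_ssh()         { tar czf - "$1" | ssh user@hostB tar xzf - -C /tmp/; }

# tar over nc with gzip done by tar; the receiver must be listening first:
#   hostB: nc -l -p 6969 | tar xzf - -C /tmp/
tar_gz_nc()          { tar czf - "$1" | nc -w1 hostB 6969; }

# Local dry run of the same gzip pipe: tar_gz_local SRC_DIR DEST_DIR
tar_gz_local()       { tar czf - "$1" | tar xzf - -C "$2"; }
```

Running tar_gz_local dir /some/scratch/dir first is a cheap way to check that the tar flags round-trip the directory before involving a second host.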

Results

Command     Compression    Fileset   Time
scp         No             Small     0:01:53
scp         No             Large     0:10:10
scp         Yes (ssh)      Small     0:02:46
scp         Yes (ssh)      Large     0:14:11
tar | ssh   No             Small     0:00:24
tar | ssh   No             Large     0:03:18
tar | ssh   Yes (ssh)      Small     0:01:09
tar | ssh   Yes (ssh)      Large     0:11:33
tar | ssh   Yes (tar gz)   Small     0:00:18
tar | ssh   Yes (tar gz)   Large     0:01:57
tar | nc    No             Small     0:00:21
tar | nc    No             Large     0:03:24
tar | nc    Yes (tar gz)   Small     0:00:20
tar | nc    Yes (tar gz)   Large     0:01:16

This is a summary with totals for the entire dataset transfer, with times in seconds:

Command     Compression    Time (s)
scp         No             723
scp         Yes (ssh)      1017
tar | ssh   No             222
tar | ssh   Yes (ssh)      762
tar | ssh   Yes (tar gz)   135
tar | nc    No             225
tar | nc    Yes (tar gz)   96
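Each total is simply the small-fileset time plus the large-fileset time, converted to seconds. A minimal sketch of that conversion, with hypothetical helper names to_seconds and total_seconds:

```shell
#!/bin/sh
# to_seconds H:MM:SS: convert a time from the results table to seconds
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# total_seconds T1 T2 ...: sum several H:MM:SS times, in seconds
total_seconds() {
    sum=0
    for t in "$@"; do
        sum=$((sum + $(to_seconds "$t")))
    done
    echo "$sum"
}

total_seconds 0:01:53 0:10:10   # scp, no compression: prints 723
```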

Conclusions

Scp is by far the slowest transfer method: going by the totals above, plain scp (723 s) is about 653% slower than the fastest case (96 s), that is, it takes roughly 7.5 times as long. Contrary to the common belief that ssh's encryption layer is what slows the transfer down, it is really scp itself that is slow, as tar over ssh performs as well as tar over nc. The other two things to consider are the disastrous impact of ssh's traffic compression (-C), which surprisingly slows the transfer down by roughly 41% in the case of scp and by about 243% in the tar over ssh test, and tar gzip compression, which makes transfers about 64% faster over ssh and about 134% faster over nc.
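Relative differences like these can be recomputed from the summary totals. The sketch below assumes "X% slower" means (slow - fast) / fast * 100, which is one common definition and not necessarily the one the author used; the helper name pct_slower is hypothetical.

```shell
#!/bin/sh
# pct_slower SLOW FAST: how much longer SLOW took than FAST, in percent,
# computed as (SLOW - FAST) / FAST * 100 and rounded to an integer.
pct_slower() {
    awk -v slow="$1" -v fast="$2" \
        'BEGIN { printf "%.0f\n", (slow - fast) * 100 / fast }'
}

pct_slower 723 96     # plain scp vs tar | nc with tar gz
pct_slower 1017 723   # scp with ssh compression vs plain scp
pct_slower 762 222    # tar | ssh with ssh compression vs plain tar | ssh
pct_slower 222 135    # plain tar | ssh vs tar | ssh with tar gz
pct_slower 225 96     # plain tar | nc vs tar | nc with tar gz
```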

Published at DZone with permission of Spike Morelli, author and DZone MVB. (source)


Comments

Hontvári Levente replied on Thu, 2012/05/24 - 6:50pm

I would add that on Windows, WinSCP and Putty (they use the same code for file transfer) are extremely slow compared to OpenSSH on Linux when transferring a single large file. I measured the difference, and the Windows values were about 1/5 or even 1/10 of the speed measured on Ubuntu.

In case of a single large file transfer, between two Ubuntu hosts, one of which was connected to a 100 Mb switch, about 10 ms roundtrip distance, OpenSSH scp was able to saturate the connection.

Endre Varga replied on Fri, 2012/05/25 - 1:58am

Be careful! You cannot really benchmark compression if you use essentially uncompressible data (you use files with random content).

Mladen Girazovski replied on Fri, 2012/05/25 - 5:43am in response to: Hontvári Levente

"In case of a single large file transfer, between two Ubuntu hosts, one of which was connected to a 100 Mb switch, about 10 ms roundtrip distance, OpenSSH scp was able to saturate the connection."

I was never able to "saturate" the connection of my Gigabit LAN when transferring huge files (5-15 GiB) with SCP between 2 Ubuntu machines.

So IME SCP is really slow compared to unencrypted NFS, for example; the latter was actually limited by the network speed.

Bhaskar Karambelkar replied on Fri, 2012/05/25 - 10:53am

Interesting results, never knew scp was so slow.

For the tar + ssh test, I think you can get even faster results if you use an SSH master connection, because it will avoid the socket setup/teardown associated with each new connection.

Google for 'ssh master connection' for more details.

In fact, I think an ssh master connection would also lead to better results with straight-up scp.
