Tuesday, October 4, 2011

Using the DreamPlug's Crypto Chip

After talking with colleagues about this box (there are now several around), I learned that encryption on it is pretty slow. But that is without the hardware encryption enabled, so let's see if it can be enabled.

For more information visit:
http://www.newit.co.uk/forum/index.php?topic=2030.0

Reference


Intel(R) Pentium(R) CPU G6950 @ 2.80GHz
$ openssl speed -evp aes128
Doing aes-128-cbc for 3s on 16 size blocks: 12582002 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 4295548 aes-128-cbc's in 2.99s
Doing aes-128-cbc for 3s on 256 size blocks: 1121451 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 1024 size blocks: 284735 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 8192 size blocks: 35731 aes-128-cbc's in 3.00s
OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008
built on: Tue Dec  7 12:16:36 EST 2010
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DKRB5_MIT -I/usr/kerberos/include -DL_ENDIAN -DTERMIO -Wall -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i686 -mtune=generic -fasynchronous-unwind-tables -Wa,--noexecstack -DOPENSSL_USE_NEW_FUNCTIONS -fno-strict-aliasing -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc      67104.01k    91944.84k    95379.22k    97189.55k    97569.45k

On the DreamPlug without Hardware AES


$ openssl speed -evp aes128
Doing aes-128-cbc for 3s on 16 size blocks: 1520029 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 451973 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 256 size blocks: 118487 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 1024 size blocks: 29964 aes-128-cbc's in 2.99s
Doing aes-128-cbc for 3s on 8192 size blocks: 3758 aes-128-cbc's in 3.00s
OpenSSL 0.9.8o 01 Jun 2010
built on: Thu Feb 10 21:19:23 UTC 2011
options:bn(64,32) md2(int) rc4(ptr,int) des(idx,risc1,4,long) aes(partial) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -O2 -Wa,--noexecstack -g -Wall
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc       8106.82k     9642.09k    10110.89k    10261.92k    10261.85k
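
To make the comparison concrete, the summary rows can be reproduced from the raw counts: blocks processed times block size, divided by the measured seconds, in thousands of bytes. The sketch below does that arithmetic for the 8192-byte rows of the two runs above. The helper name throughput_k is made up for illustration and is not part of OpenSSL.

```shell
# throughput_k BLOCKS BLOCK_SIZE SECONDS
# Prints thousands of bytes per second, the unit openssl speed reports.
# (throughput_k is a hypothetical helper, not an OpenSSL tool.)
throughput_k() {
    awk -v n="$1" -v size="$2" -v secs="$3" \
        'BEGIN { printf "%.2fk\n", n * size / secs / 1000 }'
}

# DreamPlug, software AES, 8192-byte blocks (counts from the run above)
throughput_k 3758 8192 3.00

# Pentium G6950 reference, 8192-byte blocks
throughput_k 35731 8192 3.00
```

By these figures the Pentium is roughly 9.5 times faster than the DreamPlug doing AES-128-CBC in software.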

HOWTO Configure Hardware AES with OpenSSL

First I had to get some things set up:

# NOTE: Watch your disk space. I moved a lot of directories to an attached eSATA disk for more space and speed, then symlinked them into place.
sudo aptitude install build-essential
cd /usr/src
wget http://dreamplug.googlecode.com/files/linux-2.6.33.6.tar.bz2
tar -xjf linux-2.6.33.6.tar.bz2
ln -s linux-2.6.33.6 linux
ln -s /usr/src/linux-2.6.33.6 /lib/modules/2.6.33.6/build
ln -s /usr/src/linux-2.6.33.6 /lib/modules/2.6.33.6/source
cd /usr/src/linux
wget http://archlinuxarm.org/mirror/with-linux/2.6.33/2.6.33.6/sheeva-2.6.33.6.config
zcat /proc/config.gz > .config
make uImage && make modules
wget http://download.gna.org/cryptodev-linux/cryptodev-linux-1.0.tar.gz
tar -xzf cryptodev-linux-1.0.tar.gz
cd cryptodev-linux-1.0
make && make install
echo "cryptodev" >> /etc/modules
modprobe cryptodev
wget http://sourceforge.net/projects/ocf-linux/files/ocf-linux/20110530/ocf-linux-20110530.tar.gz/download -O ocf-linux-20110530.tar.gz
tar -xzf ocf-linux-20110530.tar.gz
wget http://www.openssl.org/source/openssl-0.9.8r.tar.gz
tar -xzf openssl-0.9.8r.tar.gz
cd openssl-0.9.8r
patch -p1 < ../ocf-linux-20110530/patches/openssl-0.9.8r.patch
./config shared threads zlib --with-cryptodev --openssldir=/etc/ssl --libdir=/usr/lib --prefix=/usr
make depend && make && make install
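
Before benchmarking, it is worth checking that the pieces are actually in place. The following is a minimal sketch, assuming the cryptodev module exposes its character device at /dev/crypto (the usual location, but an assumption here). The function name check_cryptodev is made up for illustration.

```shell
# check_cryptodev [DEVICE]
# Reports whether the crypto character device exists.
# DEVICE defaults to /dev/crypto, which is an assumption about the setup.
check_cryptodev() {
    dev="${1:-/dev/crypto}"
    if [ -c "$dev" ]; then
        echo "ok: $dev is present"
    else
        echo "missing: $dev (is the cryptodev module loaded?)"
        return 1
    fi
}

# On the DreamPlug one might then run:
#   check_cryptodev
#   lsmod | grep cryptodev
#   openssl engine -c cryptodev   # lists the ciphers the engine claims to offload
```

If the engine listing does not mention AES-128-CBC, the benchmark below is probably still running in software.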


Then fix the symbol versioning, as described here: http://chris.dzombak.name/blog/2010/03/building-openssl-with-symbol-versioning

Results

Here are the results with the rebuilt OpenSSL:


$ openssl speed -evp aes128
Doing aes-128-cbc for 3s on 16 size blocks: 78428 aes-128-cbc's in 0.14s
Doing aes-128-cbc for 3s on 64 size blocks: 76194 aes-128-cbc's in 0.17s
Doing aes-128-cbc for 3s on 256 size blocks: 63152 aes-128-cbc's in 0.06s
Doing aes-128-cbc for 3s on 1024 size blocks: 39103 aes-128-cbc's in 0.04s
Doing aes-128-cbc for 3s on 2048 size blocks: 23210 aes-128-cbc's in 0.00s
OpenSSL 0.9.8r 8 Feb 2011
built on: Thu Jul 14 09:50:39 MDT 2011
options:bn(64,32) md2(int) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DHAVE_CRYPTODEV -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -Wall
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   2048 bytes
aes-128-cbc       8963.20k    28684.80k   269448.53k  1001036.80k 47534080.00k


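Yeah, way, way faster, at least on paper. Keep in mind that the timing function used is times, which counts CPU time rather than wall-clock time, so with the work offloaded to the chip these figures mostly show how little CPU is being used. The last column was timed at 0.00s and is not a meaningful rate. Oh, and SSH uses OpenSSL, so be careful: if the new build is broken, it can cut your connection, and you will need the JTAG to get back in.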
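
The summary row above is computed from CPU seconds (0.14s, 0.06s and so on), not from the time the run actually took. As a rough cross-check, the sketch below recomputes the 256-byte and 1024-byte figures twice: once with the reported CPU time, and once assuming each run lasted the full 3 seconds of wall-clock time that "Doing aes-128-cbc for 3s" suggests. That 3 second figure is an assumption, not a measurement, and rate_k is a made-up helper name.

```shell
# rate_k BLOCKS BLOCK_SIZE SECONDS
# Thousands of bytes per second, as in the openssl speed summary row.
# (rate_k is a hypothetical helper, not an OpenSSL tool.)
rate_k() {
    awk -v n="$1" -v size="$2" -v secs="$3" \
        'BEGIN { printf "%.2fk\n", n * size / secs / 1000 }'
}

# 256-byte blocks: reported CPU time vs. an assumed 3 s of wall-clock time
rate_k 63152 256 0.06
rate_k 63152 256 3

# 1024-byte blocks over the same assumed 3 s
rate_k 39103 1024 3

# The 2048-byte column (47534080.00k) equals 23210 * 2048, the raw byte
# count, because the measured CPU time was 0.00s.
# For a real wall-clock measurement one could rerun the benchmark as:
#   openssl speed -elapsed -evp aes128
```

Under that assumption the chip comes out somewhat ahead of the software figures for larger blocks and behind them for small ones, with the main gain being the CPU time freed up. A run with -elapsed would settle it.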

2 comments:

rift said...

Hi,
I have followed what you said about getting crypto support, and it seems to work.

But I think my ssh is not using openssh with cryptodev: I get really high CPU load during scp transfers (50% at 4.5 MB/s). Do you have any idea what could be wrong?

thanks,
fred

aortizjr said...

Well, I think there must be some additional overhead with SSH, even if it uses OpenSSL and the chip for some offload.

I didn't have a chance to compare before and after, and I thought that SSH didn't use OpenSSL at all. But it certainly broke when I broke OpenSSL on the first try. I also noticed the chip's speed test results drop as soon as ssh restarted.

What was your CPU load and speed before?

Another thought is that the crypto chip only supports certain ciphers, and my tests were with AES-128. SSH, if I remember correctly, does AES for the initial key transfer, then switches to something else. So maybe see what the chip supports and force SSH to use that, or switch to ARC4 (or even 3DES since it is algebraic), which has really low CPU usage.