[HEXDEV-596] Encryption of H2O communication channels (h2oai#71)
* [HEXDEV-596] Encryption of H2O communication channels
A simple SSL/TLS-based authentication/encryption mechanism for
communication between H2O nodes is now supported.

H2O Security
===================

SSL internode security
------------------------------
By default, communication between H2O nodes is not encrypted, for
performance reasons. H2O currently supports SSL/TLS authentication
(basic handshake authentication) and data encryption for internode
communication.

Usage
============

1) Hadoop.

The easiest way to enable SSL while running H2O via h2odriver is to
pass the "-ssl_config" flag with no arguments. This tells h2odriver to
automatically generate all the necessary files and distribute them to
all mappers. Whether this distribution is secure depends on your YARN
configuration.

    hadoop jar h2odriver.jar -nodes 4 -mapperXmx 6g -output hdfsOutputDirName -ssl_config

The user can also manually generate the keystore/truststore and
properties file, as described in section 3) Standalone/AWS, and run the
following command to use them instead:

    hadoop jar h2odriver.jar -nodes 4 -mapperXmx 6g -output hdfsOutputDirName -ssl_config ssl.properties

In that case, all the files (certificates and properties) have to be
distributed to all the mapper nodes by the user.

2) Spark.

Please check the Sparkling Water documentation for instructions on how
to enable SSL while running on Spark.

3) Standalone/AWS.

In this case, the user has to generate the keystores, truststores, and
properties file manually.

a) Generate public/private keys and distribute them (see the
"Keystore/truststore generation" section for more information).

b) Generate the "ssl.properties" file (for a full list of parameters,
see the "Configuration" section):

    h2o_ssl_jks_internal=keystore.jks
    h2o_ssl_jks_password=password
    h2o_ssl_jts=truststore.jks
    h2o_ssl_jts_password=password

c) To start an SSL-enabled node, pass the location of the properties
file using "-ssl_config":

    java -jar h2o.jar -ssl_config ssl.properties

Configuration
------------------------------
To enable this feature, set the -ssl_config <path> parameter when
starting an H2O node, pointing to a configuration file (key=value
format) containing the following values:

 - h2o_ssl_jks_internal (optional) - a path (absolute or relative) to a
   key-store file used for internal SSL communication
 - h2o_ssl_jks_password (optional) - the password to the internal
   key-store
 - h2o_ssl_jts (optional) - a path (absolute or relative) to a
   trust-store file used for internal SSL communication
 - h2o_ssl_jts_password (optional) - the password to the internal
   trust-store
 - h2o_ssl_protocol (optional) - the protocol name used during encrypted
   communication (must be supported by the JVM). Defaults to TLSv1.2.
 - h2o_ssl_enabled_algorithms (optional) - a comma-separated list of
   enabled cipher algorithms (ones supported by the JVM)
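Putting the optional parameters together with the required stores, a
complete configuration file might look like the following sketch (the
paths, passwords, and cipher-suite names are illustrative values, not
defaults):

```properties
# Stores used for internal SSL communication
h2o_ssl_jks_internal=/opt/h2o/keystore.jks
h2o_ssl_jks_password=changeit
h2o_ssl_jts=/opt/h2o/truststore.jks
h2o_ssl_jts_password=changeit

# Optional: pin the protocol and restrict cipher suites (standard JVM names)
h2o_ssl_protocol=TLSv1.2
h2o_ssl_enabled_algorithms=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
```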

Should the first four parameters (jks_internal, jks_password, jts,
jts_password) be missing, Java defaults will be used for all SSL-related
parameters. This is highly dependent on the Java version in use and in
some cases might not work, so it is advised to set them explicitly.
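For illustration only (this is not H2O's actual loader; the function
name is hypothetical), a minimal reader for this key=value format,
applying the documented TLSv1.2 protocol default, could look like:

```python
# Sketch of reading the key=value configuration format described above.
def load_ssl_config(text):
    """Parse key=value lines, skipping blanks and '#' comments."""
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        cfg[key.strip()] = value.strip()
    # Documented default: TLSv1.2 when h2o_ssl_protocol is absent.
    cfg.setdefault("h2o_ssl_protocol", "TLSv1.2")
    return cfg

config = load_ssl_config("""\
h2o_ssl_jks_internal=keystore.jks
h2o_ssl_jks_password=password
h2o_ssl_jts=truststore.jks
h2o_ssl_jts_password=password
""")
# config["h2o_ssl_protocol"] is "TLSv1.2" via the default
```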

These properties have to be set for every node in the cluster. Every
node needs access to both a Java keystore and a Java truststore
containing the appropriate keys and certificates.

This feature should not be used together with the "useUDP" flag: at
this point we do not support UDP encryption through DTLS or any other
protocol, which might result in unencrypted data transfers.

Keystore/truststore generation
------------------------------
Keystore/truststore creation and distribution are deployment specific
and have to be handled by the end user.

Basic keystore/truststore generation can be done using the keytool
program, which ships with Java; its documentation can be found at
https://docs.oracle.com/javase/7/docs/technotes/tools/solaris/keytool.html.
Each node should have a key pair generated; all public keys should be
imported into a single truststore, which is then distributed to all the
nodes.

The simplest (and not recommended) way would be to call:

    keytool -genkeypair -keystore h2o-internal.jks -alias h2o-internal

then distribute the h2o-internal.jks file to all the nodes and set it
as both the keystore and the truststore in ssl.properties.

A more secure way would be to:

1) run the same command on each node:

    keytool -genkeypair -keystore h2o-internal.jks -alias h2o-internal

2) extract the certificate on each node (the alias has to match the one
used in step 1):

    keytool -export -keystore h2o-internal.jks -alias h2o-internal -file node#.cer

3) distribute all of the above certificates to each node and on each
node create a truststore containing all of them (or put all certificates
on one node, import them into a truststore, and distribute that
truststore to each node):

    keytool -importcert -file node#.cer -keystore truststore.jks -alias node#
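The three steps above can be scripted. The sketch below is a
hypothetical helper, not part of H2O; the alias, password, and -dname
values are examples only. It builds one keystore per node plus a shared
truststore in the current directory:

```shell
# Hypothetical helper: automates steps 1-3 for a small cluster.
gen_node_stores() {
    nodes="$1"
    pass="changeit"   # example password; use your own
    i=1
    while [ "$i" -le "$nodes" ]; do
        # 1) generate a per-node key pair
        keytool -genkeypair -keyalg RSA -keysize 2048 \
            -keystore "node${i}-keystore.jks" -alias h2o-internal \
            -storepass "$pass" -keypass "$pass" \
            -dname "CN=h2o-node-${i}" -validity 365
        # 2) export that node's certificate
        keytool -export -keystore "node${i}-keystore.jks" -alias h2o-internal \
            -storepass "$pass" -file "node${i}.cer"
        # 3) import every certificate into one shared truststore
        keytool -importcert -noprompt -file "node${i}.cer" \
            -keystore truststore.jks -alias "node${i}" -storepass "$pass"
        i=$((i + 1))
    done
}

# keytool ships with the JDK; skip quietly when it is not on the PATH
if command -v keytool >/dev/null 2>&1; then
    gen_node_stores 2
fi
```

The resulting truststore.jks plus each node's own keystore then map
onto the h2o_ssl_jts and h2o_ssl_jks_internal properties described in
the Configuration section.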

Performance
===================
Turning on SSL might result in a performance overhead (roughly 10% to
100%, and more for communication-heavy algorithms) for settings and
algorithms that exchange data between nodes.

Example benchmark on a 5-node cluster (6 GB of memory per node) working
with a 5.8-million-row dataset (580 MB):

                    Non-SSL     SSL
    Parsing:        4.908s      5.304s
    GLM modelling:  01:39.446   01:49.634
    DL modelling:   11:53.54    28:06.738
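To put those numbers in perspective, the relative overhead can be
derived from the table (timings transcribed from above; the percentages
are computed here, not measured separately):

```python
# Convert the benchmark timings above to seconds and derive the SSL
# overhead as a percentage of the non-SSL time.
def to_seconds(timing):
    """Parse 's.sss' or 'mm:ss.sss' values from the table."""
    parts = timing.rstrip("s").split(":")
    seconds = float(parts[-1])
    if len(parts) == 2:
        seconds += 60 * float(parts[0])
    return seconds

benchmarks = {
    "Parsing":       ("4.908s",    "5.304s"),
    "GLM modelling": ("01:39.446", "01:49.634"),
    "DL modelling":  ("11:53.54",  "28:06.738"),
}

overhead = {
    name: 100 * (to_seconds(ssl) - to_seconds(plain)) / to_seconds(plain)
    for name, (plain, ssl) in benchmarks.items()
}
# Parsing is ~8%, GLM ~10%, while DL more than doubles (~136%).
```

Note that the deep-learning case lands above the quoted 10-100% range,
consistent with the warning that algorithms exchanging the most data
between nodes pay the highest price.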

Caveats and missing pieces
===================
 - this feature CANNOT be used together with the "useUDP" flag; we
   currently do not support DTLS or any other encryption for UDP
 - should you start a mixed cloud of SSL and non-SSL nodes, the SSL
   ones will fail to bootstrap while the non-SSL ones will become
   unresponsive
 - we do not provide in-memory data encryption; should swaps to disk
   occur, data might spill to disk in unencrypted form. As a
   workaround, an encrypted drive is advised
 - we do not support encryption of data saved to disk, should the
   appropriate flags be enabled. As with the previous caveat, an
   encrypted drive can be used to work around this issue
 - currently we support only SSL, not SASL

* Code review fixes

* Log tshark output for debug

* Tshark outdir typo fix

* More code review fixes

* Makes keystore work on IBM JDK

* Fix test ssl gradle property check

* Change ssl arg names as per code review result

* Test ssl uppercased

* Do not kill the node on bad TCP payload, ignore it after logging.

* Remove dead method

* Don't kill the node on lack of sentinel in intial TCP - simply close the connection

* Fix SSL encryption test compilation error
mdymczyk authored and mmalohlava committed Sep 28, 2016
1 parent b20615d commit 3a21cda
Showing 35 changed files with 1,522 additions and 99 deletions.
3 changes: 3 additions & 0 deletions gradle.properties
@@ -29,3 +29,6 @@ org.gradle.jvmargs='-XX:MaxPermSize=384m'

# Used for h2o-bindings generation, to allow us to use an extended h2o.jar
h2oJarfile='build/h2o.jar'

# Run ssl tests
doTestSSL=false
5 changes: 5 additions & 0 deletions gradle/multiNodeTesting.gradle
@@ -22,6 +22,11 @@ task testMultiNode(type: Exec) {
environment "BUILD_DIR", project.buildDir

def args = ['bash', './testMultiNode.sh']

if(project.hasProperty('doTestSSL')) {
args << 'ssl'
}

if (project.hasProperty("jacocoCoverage")) {
args << 'jacoco'
}
15 changes: 11 additions & 4 deletions h2o-algos/build.gradle
@@ -11,6 +11,15 @@ dependencies {

apply from: "${rootDir}/gradle/dataCheck.gradle"

task testSSLEncryption(type: Exec) {
dependsOn cpLibs, jar, testJar
if(project.hasProperty('doTestSSL')) {
commandLine 'bash', './testSSL.sh'
} else {
commandLine 'echo', 'SSL tests not enabled'
}
}

// The default 'test' behavior is broken in that it does not grok clusters.
// For H2O, all tests need to be run on a cluster, where each JVM is
// "free-running" - it's stdout/stderr are NOT hooked by another process. If
@@ -22,10 +31,8 @@ apply from: "${rootDir}/gradle/dataCheck.gradle"
// level) to files - then scrape the files later for test results.
test {
dependsOn ":h2o-core:testJar"
dependsOn smalldataCheck, cpLibs, jar, testJar, testSingleNode, testMultiNode
dependsOn smalldataCheck, cpLibs, jar, testJar, testSingleNode, testMultiNode, testSSLEncryption

// Defeat task 'test' by running no tests.
exclude '**'
}

testMultiNode.shouldRunAfter testSingleNode
}
84 changes: 84 additions & 0 deletions h2o-algos/src/test/java/water/network/SSLEncryptionTest.java
@@ -0,0 +1,84 @@
package water.network;

import hex.tree.gbm.GBM;
import hex.tree.gbm.GBMModel;
import org.junit.Assert;
import org.junit.Ignore;
import water.TestUtil;
import water.fvec.Frame;
import water.util.Log;

import java.util.Date;

import static hex.genmodel.utils.DistributionFamily.gaussian;

/**
* This class is used to capture TCP packets while training a model
* The result is then used to check if SSL encryption is working properly
*/
@Ignore
public class SSLEncryptionTest extends TestUtil {

public static void main(String[] args) {
if (args.length == 1) {
testGBMRegressionGaussianSSL(args[0]);
} else {
testGBMRegressionGaussianNonSSL();
}

System.exit(0);
}

public static void testGBMRegressionGaussianNonSSL() {
stall_till_cloudsize(4);
testGBMRegressionGaussian();
}

public static void testGBMRegressionGaussianSSL(String prop) {
stall_till_cloudsize(new String[] {"-internal_security_conf", prop}, 4);
testGBMRegressionGaussian();
}

private static void testGBMRegressionGaussian() {
GBMModel gbm = null;
Frame fr = null, fr2 = null;
try {
Date start = new Date();

fr = parse_test_file("./smalldata/gbm_test/Mfgdata_gaussian_GBM_testing.csv");
GBMModel.GBMParameters parms = new GBMModel.GBMParameters();
parms._train = fr._key;
parms._distribution = gaussian;
parms._response_column = fr._names[1]; // Row in col 0, dependent in col 1, predictor in col 2
parms._ntrees = 1;
parms._max_depth = 1;
parms._min_rows = 1;
parms._nbins = 20;
// Drop ColV2 0 (row), keep 1 (response), keep col 2 (only predictor), drop remaining cols
String[] xcols = parms._ignored_columns = new String[fr.numCols()-2];
xcols[0] = fr._names[0];
System.arraycopy(fr._names,3,xcols,1,fr.numCols()-3);
parms._learn_rate = 1.0f;
parms._score_each_iteration=true;

GBM job = new GBM(parms);
gbm = job.trainModel().get();

Log.info(">>> GBM parsing and training took: " + (new Date().getTime() - start.getTime()) + " ms.");

Assert.assertTrue(job.isStopped()); //HEX-1817

// Done building model; produce a score column with predictions

Date scoringStart = new Date();

fr2 = gbm.score(fr);

Log.info(">>> GBM scoring took: " + (new Date().getTime() - scoringStart.getTime()) + " ms.");
} finally {
if( fr != null ) fr .remove();
if( fr2 != null ) fr2.remove();
if( gbm != null ) gbm.remove();
}
}
}
5 changes: 5 additions & 0 deletions h2o-algos/src/test/resources/ssl.properties
@@ -0,0 +1,5 @@
h2o_ssl_protocol=TLSv1.2
h2o_ssl_jks_internal=../h2o-core/src/test/resources/keystore.jks
h2o_ssl_jks_password=password
h2o_ssl_jts=../h2o-core/src/test/resources/cacerts.jks
h2o_ssl_jts_password=password
5 changes: 5 additions & 0 deletions h2o-algos/src/test/resources/ssl2.properties
@@ -0,0 +1,5 @@
h2o_ssl_protocol=TLSv1.2
h2o_ssl_jks_internal=../../h2o-core/src/test/resources/keystore.jks
h2o_ssl_jks_password=password
h2o_ssl_jts=../../h2o-core/src/test/resources/cacerts.jks
h2o_ssl_jts_password=password
5 changes: 5 additions & 0 deletions h2o-algos/src/test/resources/ssl3.properties
@@ -0,0 +1,5 @@
h2o_ssl_protocol=TLSv1.2
h2o_ssl_jks_internal=../../../h2o-core/src/test/resources/keystore.jks
h2o_ssl_jks_password=password
h2o_ssl_jts=../../../h2o-core/src/test/resources/cacerts.jks
h2o_ssl_jts_password=password
41 changes: 21 additions & 20 deletions h2o-algos/testMultiNode.sh
@@ -1,4 +1,5 @@
#!/bin/bash
source ../multiNodeUtils.sh

# Argument parsing
if [ "$1" = "jacoco" ]
@@ -109,21 +110,21 @@ CLUSTER_BASEPORT_2=45000
CLUSTER_BASEPORT_3=46000
CLUSTER_BASEPORT_4=47000
CLUSTER_BASEPORT_5=48000
$JVM water.H2O -name $CLUSTER_NAME.1 -baseport $CLUSTER_BASEPORT_1 -ga_opt_out 1> $OUTDIR/out.1.1 2>&1 & PID_11=$!
$JVM water.H2O -name $CLUSTER_NAME.1 -baseport $CLUSTER_BASEPORT_1 -ga_opt_out 1> $OUTDIR/out.1.2 2>&1 & PID_12=$!
$JVM water.H2O -name $CLUSTER_NAME.1 -baseport $CLUSTER_BASEPORT_1 -ga_opt_out 1> $OUTDIR/out.1.3 2>&1 & PID_13=$!
$JVM water.H2O -name $CLUSTER_NAME.2 -baseport $CLUSTER_BASEPORT_2 -ga_opt_out 1> $OUTDIR/out.2.1 2>&1 & PID_21=$!
$JVM water.H2O -name $CLUSTER_NAME.2 -baseport $CLUSTER_BASEPORT_2 -ga_opt_out 1> $OUTDIR/out.2.2 2>&1 & PID_22=$!
$JVM water.H2O -name $CLUSTER_NAME.2 -baseport $CLUSTER_BASEPORT_2 -ga_opt_out 1> $OUTDIR/out.2.3 2>&1 & PID_23=$!
$JVM water.H2O -name $CLUSTER_NAME.3 -baseport $CLUSTER_BASEPORT_3 -ga_opt_out 1> $OUTDIR/out.3.1 2>&1 & PID_31=$!
$JVM water.H2O -name $CLUSTER_NAME.3 -baseport $CLUSTER_BASEPORT_3 -ga_opt_out 1> $OUTDIR/out.3.2 2>&1 & PID_32=$!
$JVM water.H2O -name $CLUSTER_NAME.3 -baseport $CLUSTER_BASEPORT_3 -ga_opt_out 1> $OUTDIR/out.3.3 2>&1 & PID_33=$!
$JVM water.H2O -name $CLUSTER_NAME.4 -baseport $CLUSTER_BASEPORT_4 -ga_opt_out 1> $OUTDIR/out.4.1 2>&1 & PID_41=$!
$JVM water.H2O -name $CLUSTER_NAME.4 -baseport $CLUSTER_BASEPORT_4 -ga_opt_out 1> $OUTDIR/out.4.2 2>&1 & PID_42=$!
$JVM water.H2O -name $CLUSTER_NAME.4 -baseport $CLUSTER_BASEPORT_4 -ga_opt_out 1> $OUTDIR/out.4.3 2>&1 & PID_43=$!
$JVM water.H2O -name $CLUSTER_NAME.5 -baseport $CLUSTER_BASEPORT_5 -ga_opt_out 1> $OUTDIR/out.5.1 2>&1 & PID_51=$!
$JVM water.H2O -name $CLUSTER_NAME.5 -baseport $CLUSTER_BASEPORT_5 -ga_opt_out 1> $OUTDIR/out.5.2 2>&1 & PID_52=$!
$JVM water.H2O -name $CLUSTER_NAME.5 -baseport $CLUSTER_BASEPORT_5 -ga_opt_out 1> $OUTDIR/out.5.3 2>&1 & PID_53=$!
$JVM water.H2O -name $CLUSTER_NAME.1 -baseport $CLUSTER_BASEPORT_1 -ga_opt_out $SSL 1> $OUTDIR/out.1.1 2>&1 & PID_11=$!
$JVM water.H2O -name $CLUSTER_NAME.1 -baseport $CLUSTER_BASEPORT_1 -ga_opt_out $SSL 1> $OUTDIR/out.1.2 2>&1 & PID_12=$!
$JVM water.H2O -name $CLUSTER_NAME.1 -baseport $CLUSTER_BASEPORT_1 -ga_opt_out $SSL 1> $OUTDIR/out.1.3 2>&1 & PID_13=$!
$JVM water.H2O -name $CLUSTER_NAME.2 -baseport $CLUSTER_BASEPORT_2 -ga_opt_out $SSL 1> $OUTDIR/out.2.1 2>&1 & PID_21=$!
$JVM water.H2O -name $CLUSTER_NAME.2 -baseport $CLUSTER_BASEPORT_2 -ga_opt_out $SSL 1> $OUTDIR/out.2.2 2>&1 & PID_22=$!
$JVM water.H2O -name $CLUSTER_NAME.2 -baseport $CLUSTER_BASEPORT_2 -ga_opt_out $SSL 1> $OUTDIR/out.2.3 2>&1 & PID_23=$!
$JVM water.H2O -name $CLUSTER_NAME.3 -baseport $CLUSTER_BASEPORT_3 -ga_opt_out $SSL 1> $OUTDIR/out.3.1 2>&1 & PID_31=$!
$JVM water.H2O -name $CLUSTER_NAME.3 -baseport $CLUSTER_BASEPORT_3 -ga_opt_out $SSL 1> $OUTDIR/out.3.2 2>&1 & PID_32=$!
$JVM water.H2O -name $CLUSTER_NAME.3 -baseport $CLUSTER_BASEPORT_3 -ga_opt_out $SSL 1> $OUTDIR/out.3.3 2>&1 & PID_33=$!
$JVM water.H2O -name $CLUSTER_NAME.4 -baseport $CLUSTER_BASEPORT_4 -ga_opt_out $SSL 1> $OUTDIR/out.4.1 2>&1 & PID_41=$!
$JVM water.H2O -name $CLUSTER_NAME.4 -baseport $CLUSTER_BASEPORT_4 -ga_opt_out $SSL 1> $OUTDIR/out.4.2 2>&1 & PID_42=$!
$JVM water.H2O -name $CLUSTER_NAME.4 -baseport $CLUSTER_BASEPORT_4 -ga_opt_out $SSL 1> $OUTDIR/out.4.3 2>&1 & PID_43=$!
$JVM water.H2O -name $CLUSTER_NAME.5 -baseport $CLUSTER_BASEPORT_5 -ga_opt_out $SSL 1> $OUTDIR/out.5.1 2>&1 & PID_51=$!
$JVM water.H2O -name $CLUSTER_NAME.5 -baseport $CLUSTER_BASEPORT_5 -ga_opt_out $SSL 1> $OUTDIR/out.5.2 2>&1 & PID_52=$!
$JVM water.H2O -name $CLUSTER_NAME.5 -baseport $CLUSTER_BASEPORT_5 -ga_opt_out $SSL 1> $OUTDIR/out.5.3 2>&1 & PID_53=$!

# If coverage is being run, then pass a system variable flag so that timeout limits are increased.
if [ $JACOCO_ENABLED = true ]
@@ -136,11 +137,11 @@ fi
# Launch last driver JVM. All output redir'd at the OS level to sandbox files.
echo Running h2o-algos junit tests...

($JVM -Ddoonly.tests=$DOONLY -Dbuild.id=$BUILD_ID -Dignore.tests=$IGNORE -Djob.name=$JOB_NAME -Dgit.commit=$GIT_COMMIT -Dgit.branch=$GIT_BRANCH -Dai.h2o.name=$CLUSTER_NAME.1 -Dai.h2o.baseport=$CLUSTER_BASEPORT_1 -Dai.h2o.ga_opt_out=yes $JACOCO_FLAG $JUNIT_RUNNER $JUNIT_TESTS_BOOT `cat $OUTDIR/tests.txt | awk 'NR%5==0'` 2>&1 ; echo $? > $OUTDIR/status.1) 1> $OUTDIR/out.1 2>&1 & PID_1=$!
($JVM -Ddoonly.tests=$DOONLY -Dbuild.id=$BUILD_ID -Dignore.tests=$IGNORE -Djob.name=$JOB_NAME -Dgit.commit=$GIT_COMMIT -Dgit.branch=$GIT_BRANCH -Dai.h2o.name=$CLUSTER_NAME.2 -Dai.h2o.baseport=$CLUSTER_BASEPORT_2 -Dai.h2o.ga_opt_out=yes $JACOCO_FLAG $JUNIT_RUNNER $JUNIT_TESTS_BOOT `cat $OUTDIR/tests.txt | awk 'NR%5==1'` 2>&1 ; echo $? > $OUTDIR/status.2) 1> $OUTDIR/out.2 2>&1 & PID_2=$!
($JVM -Ddoonly.tests=$DOONLY -Dbuild.id=$BUILD_ID -Dignore.tests=$IGNORE -Djob.name=$JOB_NAME -Dgit.commit=$GIT_COMMIT -Dgit.branch=$GIT_BRANCH -Dai.h2o.name=$CLUSTER_NAME.3 -Dai.h2o.baseport=$CLUSTER_BASEPORT_3 -Dai.h2o.ga_opt_out=yes $JACOCO_FLAG $JUNIT_RUNNER $JUNIT_TESTS_BOOT `cat $OUTDIR/tests.txt | awk 'NR%5==2'` 2>&1 ; echo $? > $OUTDIR/status.3) 1> $OUTDIR/out.3 2>&1 & PID_3=$!
($JVM -Ddoonly.tests=$DOONLY -Dbuild.id=$BUILD_ID -Dignore.tests=$IGNORE -Djob.name=$JOB_NAME -Dgit.commit=$GIT_COMMIT -Dgit.branch=$GIT_BRANCH -Dai.h2o.name=$CLUSTER_NAME.4 -Dai.h2o.baseport=$CLUSTER_BASEPORT_4 -Dai.h2o.ga_opt_out=yes $JACOCO_FLAG $JUNIT_RUNNER $JUNIT_TESTS_BOOT `cat $OUTDIR/tests.txt | awk 'NR%5==3'` 2>&1 ; echo $? > $OUTDIR/status.4) 1> $OUTDIR/out.4 2>&1 & PID_4=$!
($JVM -Ddoonly.tests=$DOONLY -Dbuild.id=$BUILD_ID -Dignore.tests=$IGNORE -Djob.name=$JOB_NAME -Dgit.commit=$GIT_COMMIT -Dgit.branch=$GIT_BRANCH -Dai.h2o.name=$CLUSTER_NAME.5 -Dai.h2o.baseport=$CLUSTER_BASEPORT_5 -Dai.h2o.ga_opt_out=yes $JACOCO_FLAG $JUNIT_RUNNER $JUNIT_TESTS_BOOT `cat $OUTDIR/tests.txt | awk 'NR%5==4'` 2>&1 ; echo $? > $OUTDIR/status.5) 1> $OUTDIR/out.5 2>&1 & PID_5=$!
($JVM $TEST_SSL -Ddoonly.tests=$DOONLY -Dbuild.id=$BUILD_ID -Dignore.tests=$IGNORE -Djob.name=$JOB_NAME -Dgit.commit=$GIT_COMMIT -Dgit.branch=$GIT_BRANCH -Dai.h2o.name=$CLUSTER_NAME.1 -Dai.h2o.baseport=$CLUSTER_BASEPORT_1 -Dai.h2o.ga_opt_out=yes $JACOCO_FLAG $JUNIT_RUNNER $JUNIT_TESTS_BOOT `cat $OUTDIR/tests.txt | awk 'NR%5==0'` 2>&1 ; echo $? > $OUTDIR/status.1) 1> $OUTDIR/out.1 2>&1 & PID_1=$!
($JVM $TEST_SSL -Ddoonly.tests=$DOONLY -Dbuild.id=$BUILD_ID -Dignore.tests=$IGNORE -Djob.name=$JOB_NAME -Dgit.commit=$GIT_COMMIT -Dgit.branch=$GIT_BRANCH -Dai.h2o.name=$CLUSTER_NAME.2 -Dai.h2o.baseport=$CLUSTER_BASEPORT_2 -Dai.h2o.ga_opt_out=yes $JACOCO_FLAG $JUNIT_RUNNER $JUNIT_TESTS_BOOT `cat $OUTDIR/tests.txt | awk 'NR%5==1'` 2>&1 ; echo $? > $OUTDIR/status.2) 1> $OUTDIR/out.2 2>&1 & PID_2=$!
($JVM $TEST_SSL -Ddoonly.tests=$DOONLY -Dbuild.id=$BUILD_ID -Dignore.tests=$IGNORE -Djob.name=$JOB_NAME -Dgit.commit=$GIT_COMMIT -Dgit.branch=$GIT_BRANCH -Dai.h2o.name=$CLUSTER_NAME.3 -Dai.h2o.baseport=$CLUSTER_BASEPORT_3 -Dai.h2o.ga_opt_out=yes $JACOCO_FLAG $JUNIT_RUNNER $JUNIT_TESTS_BOOT `cat $OUTDIR/tests.txt | awk 'NR%5==2'` 2>&1 ; echo $? > $OUTDIR/status.3) 1> $OUTDIR/out.3 2>&1 & PID_3=$!
($JVM $TEST_SSL -Ddoonly.tests=$DOONLY -Dbuild.id=$BUILD_ID -Dignore.tests=$IGNORE -Djob.name=$JOB_NAME -Dgit.commit=$GIT_COMMIT -Dgit.branch=$GIT_BRANCH -Dai.h2o.name=$CLUSTER_NAME.4 -Dai.h2o.baseport=$CLUSTER_BASEPORT_4 -Dai.h2o.ga_opt_out=yes $JACOCO_FLAG $JUNIT_RUNNER $JUNIT_TESTS_BOOT `cat $OUTDIR/tests.txt | awk 'NR%5==3'` 2>&1 ; echo $? > $OUTDIR/status.4) 1> $OUTDIR/out.4 2>&1 & PID_4=$!
($JVM $TEST_SSL -Ddoonly.tests=$DOONLY -Dbuild.id=$BUILD_ID -Dignore.tests=$IGNORE -Djob.name=$JOB_NAME -Dgit.commit=$GIT_COMMIT -Dgit.branch=$GIT_BRANCH -Dai.h2o.name=$CLUSTER_NAME.5 -Dai.h2o.baseport=$CLUSTER_BASEPORT_5 -Dai.h2o.ga_opt_out=yes $JACOCO_FLAG $JUNIT_RUNNER $JUNIT_TESTS_BOOT `cat $OUTDIR/tests.txt | awk 'NR%5==4'` 2>&1 ; echo $? > $OUTDIR/status.5) 1> $OUTDIR/out.5 2>&1 & PID_5=$!

wait ${PID_1} ${PID_2} ${PID_3} ${PID_4} ${PID_5} 1> /dev/null 2>&1
grep EXECUTION $OUTDIR/out.* | sed -e "s/.*TEST \(.*\) EXECUTION TIME: \(.*\) (Wall.*/\2 \1/" | sort -gr | head -n 10 >> $OUTDIR/out.0
137 changes: 137 additions & 0 deletions h2o-algos/testSSL.sh
@@ -0,0 +1,137 @@
#!/bin/bash

# Clean out any old sandbox, make a new one
OUTDIR=sandbox
rm -fr $OUTDIR; mkdir -p $OUTDIR

# Check for os
SEP=:
case "`uname`" in
CYGWIN* )
SEP=";"
;;
esac

function cleanup () {
kill -9 ${PID_1} ${PID_2} ${PID_3} ${PID_4} 1> /dev/null 2>&1
wait 1> /dev/null 2>&1
}

function countDataCells () {
# Number of tokens we didn't find
COUNT=0
# Number of tokens we looked for
TOTAL=0
FILE=../smalldata/gbm_test/Mfgdata_gaussian_GBM_testing.csv
while IFS= read -r line; do
IFS=',' read -r -a array <<< "$line"
for el in "${array[@]}"; do
# I don't check for "\d+" since things like "1" and "11" can appear as part of SSL encrypted gibberish
# and it's not trivial to distinguish it from actual data
if [[ ! $el =~ \"[0-9]+\" ]]; then
grep -q -- "$el" sandbox/test.out
COUNT=$((COUNT + $?))
TOTAL=$((TOTAL+1))
fi
done
# Because the column names are mostly one letter they might actually appear
# in the encrypted TCP gibberish so we'll skip them but check the actual data
done <<< "$(sed 1d $FILE)"
echo "Found $((TOTAL-COUNT)) tokens from a total of $TOTAL" 1>&2
# Number of tokens we found
echo $((TOTAL-COUNT))
}

function testOutput () {
# Grab the nonSSL data field from the packet body in human readable format
tshark -x -r $OUTDIR/h2o-nonSSL.pcap -T text | awk -F " " '{print $3}' > $OUTDIR/test_tmp.out
# Remove all newlines and spaces for future grep
cat $OUTDIR/test_tmp.out | awk 1 RS='\n' ORS= | awk '{gsub(/ /,"")}1' > $OUTDIR/test.out

# Check that all the data we used as input is in the TCP dump in unencrypted form!
FOUND=$(countDataCells)
if [[ $FOUND -eq 0 ]]; then
echo "Haven't found any of the original data in the nonSSL TCP dump."
echo h2o-algos junit tests FAILED
exit 1
fi

# Grab the SSL data field from the packet body in human readable format
tshark -x -r $OUTDIR/h2o-SSL.pcap -T text | awk -F " " '{print $3}' > $OUTDIR/test_tmp.out
cat $OUTDIR/test_tmp.out | awk 1 RS='\n' ORS= | awk '{gsub(/ /,"")}1' > $OUTDIR/test.out

# Check that none of the data we used as input is in the TCP dump in unencrypted form!
FOUND=$(countDataCells)
if [[ $FOUND -ne 0 ]]; then
echo "Found original data in the SSL TCP dump."
echo h2o-algos junit tests FAILED
exit 1
fi

echo h2o-algos junit tests PASSED
exit 0
}

trap cleanup SIGTERM SIGINT

# Find java command
if [ -z "$TEST_JAVA_HOME" ]; then
# Use default
JAVA_CMD="java"
else
# Use test java home
JAVA_CMD="$TEST_JAVA_HOME/bin/java"
# Increase XMX since JAVA_HOME can point to java6
JAVA6_REGEXP=".*1\.6.*"
if [[ $TEST_JAVA_HOME =~ $JAVA6_REGEXP ]]; then
JAVA_CMD="${JAVA_CMD}"
fi
fi

JVM="nice $JAVA_CMD -ea -Xmx3g -Xms3g -cp build/libs/h2o-algos-test.jar${SEP}build/libs/h2o-algos.jar${SEP}../h2o-core/build/libs/h2o-core-test.jar${SEP}../h2o-core/build/libs/h2o-core.jar${SEP}../h2o-genmodel/build/libs/h2o-genmodel.jar${SEP}../lib/*"
echo "$JVM" > $OUTDIR/jvm_cmd.txt

SSL=""
# Launch 3 helper JVMs. All output redir'd at the OS level to sandbox files.
CLUSTER_NAME=junit_cluster_$$
CLUSTER_BASEPORT=44000
$JVM water.H2O -name $CLUSTER_NAME -baseport $CLUSTER_BASEPORT -ga_opt_out $SSL 1> $OUTDIR/out.1 2>&1 & PID_1=$!
$JVM water.H2O -name $CLUSTER_NAME -baseport $CLUSTER_BASEPORT -ga_opt_out $SSL 1> $OUTDIR/out.2 2>&1 & PID_2=$!
$JVM water.H2O -name $CLUSTER_NAME -baseport $CLUSTER_BASEPORT -ga_opt_out $SSL 1> $OUTDIR/out.3 2>&1 & PID_3=$!

INTERFACE=${TSHARK_INTERFACE:-"eth0"}

echo Running nonSSL test on interface ${INTERFACE}...

pwd

tshark -i ${INTERFACE} -T fields -e data -w ${OUTDIR}/h2o-nonSSL.pcap 1> /dev/null 2>&1 & PID_4=$!

java -Dai.h2o.name=$CLUSTER_NAME -ea \
-cp "build/libs/h2o-algos-test.jar${SEP}build/libs/h2o-algos.jar${SEP}../h2o-core/build/libs/h2o-core.jar${SEP}../h2o-core/build/libs/h2o-core-test.jar${SEP}../h2o-genmodel/build/libs/h2o-genmodel.jar${SEP}../lib/*" \
water.network.SSLEncryptionTest

echo After test cleanup...

cleanup

SSL_CONFIG="src/test/resources/ssl.properties"
SSL="-internal_security_conf "$SSL_CONFIG
CLUSTER_NAME=$CLUSTER_NAME"_2"
$JVM water.H2O -name $CLUSTER_NAME -baseport $CLUSTER_BASEPORT -ga_opt_out $SSL 1> $OUTDIR/out.1 2>&1 & PID_1=$!
$JVM water.H2O -name $CLUSTER_NAME -baseport $CLUSTER_BASEPORT -ga_opt_out $SSL 1> $OUTDIR/out.2 2>&1 & PID_2=$!
$JVM water.H2O -name $CLUSTER_NAME -baseport $CLUSTER_BASEPORT -ga_opt_out $SSL 1> $OUTDIR/out.3 2>&1 & PID_3=$!

echo Running SSL test...

tshark -i ${INTERFACE} -T fields -e data -w ${OUTDIR}/h2o-SSL.pcap 1> /dev/null 2>&1 & PID_4=$!

java -Dai.h2o.name=$CLUSTER_NAME -ea \
-cp "build/libs/h2o-algos-test.jar${SEP}build/libs/h2o-algos.jar${SEP}../h2o-core/build/libs/h2o-core.jar${SEP}../h2o-core/build/libs/h2o-core-test.jar${SEP}../h2o-genmodel/build/libs/h2o-genmodel.jar${SEP}../lib/*" \
water.network.SSLEncryptionTest src/test/resources/ssl.properties

echo After test cleanup...

cleanup

testOutput
3 changes: 2 additions & 1 deletion h2o-bindings/bin/gen_all.py
@@ -31,7 +31,8 @@
base_port=48000,
xmx="4g",
cp="",
output_dir=results_dir
output_dir=results_dir,
test_ssl=False
)
cloud.start()
cloud.wait_for_cloud_to_be_up()