One of the systems our team develops is UI for end-users, where users can view and manage their print related data.

The system is designed as a simple web application, where we make AJAX calls to Spring controllers which delegate the calls to two other systems, no database is present.

One of the requirements on the system was to support about 1000 concurrent users. Since Tomcat has by default 200 threads and the calls to the other systems may take long (fortunately it’s not the case at the moment), we have decided to make use of Servlet 3.0 async. This way, each AJAX call from the browser uses up Tomcat thread only for preparation of a call to other system. The calls are handled by our asynchronous library for communication with SafeQ and asynchronous http client for communication with Payment System which both have own threadpools and fill in the response when they get a reply.

Since we depend so much on other systems’ performance, we wanted to monitor the execution time of the requests for better tracking and debugging of production problems.

There are several endpoints for each system and more can come later, so in order to avoid duplication, we have decided to leverage Spring aspect programming support. We have created an aspect for each business service (one for SafeQ, one for Payment System) and it was time to implement the measurement business logic.

In synchronous scenario, things are pretty simple. Just intercept the call, note start and end time and if it took too long, just log it in a logfile.

@Around("remoteServiceMethod()")
public Object restTimingAspect(ProceedingJoinPoint joinPoint) throws Throwable {
    long start = System.currentTimeMillis();

    Object result = joinPoint.proceed();

    long end = System.currentTimeMillis();

    long executionInMillis = end - start;
    if (executionInMillis > remoteServiceCallDurationThresholdInMillis) {
        LOGGER.warn("Execution of {} took {} ms.", joinPoint, executionInMillis);
    }

    return result;
}

This won’t work in our asynchronous case. The call to joinPoint.proceed(), which will make call to other system returns immediately without waiting for a reply. Reply is processed in a callback provided to one of async communication libraries. So we have to do a bit more.

We know the signature of our business methods. One of the arguments is always callback, which will process the reply.

public void getEntitlement(ListenableFutureCallback callback, String userGuid, Long costCenterId)

If we want to add our monitoring logic in a transparent way, we have to create special callback implementation, which will wrap the original callback and track the total time execution.

class TimingListenableFutureCallback implements ListenableFutureCallback {

    private ListenableFutureCallback delegate;
    private StopWatch timer = new StopWatch();
    private String joinPoint;

    public TimingListenableFutureCallback(ListenableFutureCallback delegate, String joinPoint) {
        this.delegate = delegate;
        this.joinPoint = joinPoint;
        timer.start();
    }
        
    @Override
    public void onSuccess(Object result) {
        logExecution(timer, joinPoint);
        delegate.onSuccess(result);
    }

    @Override
    public void onFailure(Throwable ex) {
        logExecution(timer, joinPoint);
        delegate.onFailure(ex);
    }
}

And then we have to call the target business method with properly wrapped callback argument.

@Around("remoteServiceMethod() && args(callback,..)")
public Object restTimingAspect(ProceedingJoinPoint joinPoint, ListenableFutureCallback callback) throws Throwable {
    String joinPointName = computeJoinPointName(joinPoint);
    
    Object[] wrappedArgs = Arrays.stream(joinPoint.getArgs()).map(arg -> {
        return arg instanceof ListenableFutureCallback ? wrapCallback((ListenableFutureCallback) arg, joinPointName) : arg;
    }).toArray();
    
    LOGGER.trace("Calling remote service operation {}.", joinPointName);
    
    return joinPoint.proceed(wrappedArgs);
}

There is similar implementation for the other async messaging library.

I hope this solution will help you solve similar problems in your applications in an elegant manner 🙂

While choosing the right security layer for messaging via ZeroMQ, there were two main considerations: built-in security CurveZMQ or usage of more conventional TLS backend?

First option is ZeroMQ’s proprietary security protocol based on elliptic curve cryptography and Daniel’s Bernstein NaCl cryptographic library. CurveZMQ utilizes not only Bernstein’s Curve25519 elliptic curve, but also other ciphers, designed by him, as well- f.e. Salsa20 and poly1305. Other NaCl’s ciphers can be found on its web page under Public-key and Secret-key cryptography chapters.

From the beginning this option was really hot candidate to use, as according to papers about performance of Salsa20 and performance of Curve25519, NaCl crypto functions are faster than standard crypto functions. Also the fact that it was already included in ZeroMQ was big plus. Unfortunately NaCl library and its functions were created in years 2008-2010 and aren’t used widely as f.e. AES or SHA are, therefore aren’t matured enough as other libraries, what brings a little bit panic to our mind and ultimately was the cause of not using it.

Second option was the usage of TLS backend- TLSZMQ. This is a demo project of Ian Barber to secure ZeroMQ communication using OpenSSL library. Unfortunately TLS isn’t built into ZeroMQ, so for now this is the only option how to use TLS and ZeroMQ together 🙁

Implementations comparison

To find the best approach we had few requirements to check. Mainly it was the speed of messaging and library and crypto functions maturity, but other requirements such as support and liveliness of implementation, licensing, quality of code also weighted in decision.

From this set of metrics only 2 appeared to be ultimately decisive:

  • Library and crypto functions maturity
  • Speed of messaging

While the maturity of NaCl library was strong argument why not to use CurveZMQ, performance of NaCl mentioned before spoke strongly in favor of it.

To finally resolve which approach to use and determine the total overhead of each implementation I have created performance tests to find out if TLSZMQ could be used in our environment. The test suite for each implementation was designed to be a block of 20 separate tests running for 10 minutes each, to be sure of reliable test outcomes. Even the message sizes were chosen to reflect the size of messages to be really sent in environment.

total persec

At this point, results clearly spoke strongly in favor of CurveZMQ. On the other hand, there was a conflict with maturity of crypto functions used in CurveZMQ and use of single elliptic curve to generate keys. This could cause many problems, if the curve or other functions would be compromised in the future.

At this point I realized that tests were done with OpenSSL’s by default chosen cipher suite RSA-AES256-SHA384 (OpenSSL version 1.0.1p). There is nothing wrong with this cipher suite, but we have to admit that these algorithms were unnecessarily secure (understand with high overhead).

Yes, I know what I just said and that it sounds really bad, but we must also consider the environment of messaging (its encryption) and performance of the system as well. Only after thorough examination we should accommodate ciphers to real requirements and not just use cipher suites with unnecessarily high security (and overhead) to overkill it.

Therefore, tests were rerun with more appropriate (weaker but still strong) cipher suites. The requirement was to get as much close to CurveZMQ performance as possible:

  • elliptic curves used everywhere possible
    • key exchange [ECDH]
    • authentication [ECDSA]

Any EC could be used, not just one.

  • AES doesn’t have to use 256b key as 128b long keys are still considered pretty strong.

From few acceptable cipher suites, this one specific peaked out: ECDSA-AESGCM128-SHA256.

This new finding has shown that new cipher suite increased the messaging performance considerably (sometimes effectively doubling the speed) [see graphs].

From the results one can see, that TLSZMQ isn’t that much inferior to CurveZMQ and due to TLS’s maturity it is a better choice. The final argument is that test results of TLSZMQ were worse, but sufficient in YSoft SafeQ environment, so the choice of implementation was pretty straight-forward.

Truth is, that NaCl and Curve25519 are really fun (and much easier) to play with and I encourage you to at least have a look at them, but in the end TLS backend is more flexible in terms of crypto primitives (ciphers, key generation) and doesn’t bring that much security concerns. And that’s what we are looking for in secured ZeroMQ messaging.