Month: April 2016

Reactive Microservices

I came across CQRS and attended a talk by James Roper on Lagom for micro services called “Rethinking REST” this week. The idea put forth in the talk was that REST services being synchronous are not ideal for microservices. Microservices should not be blocking , so something that emphasizes this async aspect would be preferable to REST.

How is this notion of async services different from a pub-sub model ? One way it goes beyond that is by proposing polyglot persistence. Different microservices should use their own persistence layer that is optimal – relational, nosql, timeseries, event log.

Lagom architecture is based on the book Reactive Services Architecture and that also suggests CQRS. The book proposes service isolation and that composition of systems with microservices should be done asynchronously via message passing.

A quote from the book on why bulkheads failed to save the Titanic:

“The Titanic did use bulkheads, but the walls that were supposed to isolate the compartments did not reach all the way up to the ceiling. So when 6 out of its 16 compartments were ripped open by the iceberg, the ship started to tilt and water spilled over from one compartment to the next, until all of the compartments were filled with water and the Titanic sank, killing 1500 people.”

The suggestion is that for higher availability there should be stricter isolation. The individual (micro)services may fail but the overall system should not be affected. Looking at it this way I think requires one to examine more closely the system design and its invariants.

Take CQRS as an example. In a query system, data is being read and not being modified; a query is a search operation on accumulated data – potentially a very large set with speed (availability) demands. Whereas in a command system, there is more real-time,  perhaps collaborative aspect which leaves most of the data untouched, but creates some new data which needs to be recorded (upload this image, send this message). Why should these two very different operations be served by the same backend ?

One of the systems that has done separation of concerns remarkably well is  AWS. Let’s take the separation of S3 and the database (say Dynamodb). Neither filesystem vendors nor database vendors came up with simple services to solve the problem of an app trying to upload an object and update a database that the object has been uploaded. The failure modes of each service are exposed to the client and the client is not forced to upload files to S3 via a server (bottleneck). S3 offers a read-write consistency for new uploads and eventually consistency for updates. Here the storage object, its universally addressable name, and its properties (backup, encryption, access, versioning, retention policies) are an invariant -a client does not need to fiddle with them apart from being assured of a certain level of service. One can call it an ingestion system.  More on that here.

The takeaway is that microservices should do one thing and do it very well, in a highly available, non-blocking manner. REST services can certainly be non-blocking, but they can also be blocking as described here, which is a problem.

A cornerstone of asynchronous services is messaging. Here’s a talk on Riak, a masterless database and messaging system, running on CloudFoundry, at GE – https://youtu.be/aYaz17qf7cE?t=2161 . It is followed by talks on microservices.

Nvidia nvGraph and Tesla P100 GPU

“The latest version of NVIDIA’s parallel computing platform gives developers direct access to powerful new Pascal features, including unified memory and NVLink. Also included in this release is a new graph analytics library — nvGRAPH — which can be used for robotic path planning, cyber security and logistics analysis, expanding the application of GPU acceleration into the realm of big data analytics.”

nvGRAPH supports three widely-used algorithms:

Page Rank is most famously used in search engines, and also used in social network analysis, recommendation systems, and for novel uses in natural science when studying the relationship between proteins and in ecological networks.

Single Source Shortest Path is used to identify the fastest path from A to B through a road network, and can also be used for a optimizing a wide range of other logistics problems.

Single Source Widest Path is used in domains like IP traffic routing and traffic-sensitive path planning.

In addition, the nvGRAPH semiring Sparse Matrix Vector Multiplication (SPMV) operations can be used to build a wide range of innovative graph traversal algorithms.

A paper on how to represent cyber attacks as graphs – http://csis.gmu.edu/noel/pubs/2015_IEEE_HST.pdf references the CAPEC, which is a collection of vulnerabilities for such a graph study.

A graphdb represents data using nodes, edges and properties.

graphdatabase_propertygraph

It allows querying with a graph traversal language such as Gremlin. Blazegraph offers a GPU accelerated graphdb. Here’s a graph of machine learning papers from Research Front Maps.

rfmap1

MapD, a GPU accelerated db recently got funding from Nvidia and others.

Elliptic Curve Crypto

A brief review of concepts leading to ECC:

Algebraic Group: A set G with a single binary operator (+) that maps any 2 elements of G to another element of G.  If the + operator is commutative, G is called an Abelian group or a Commutative group.

Algebraic Ring: A set R with a two binary operators (+) and (*) that map any 2 elements of G to another element of R. Every Ring is an Abelian group with respect to (+). If it is Abelian with respect to (*) then it is called a Commutative Ring.

Algebraic Field: A set F with a two binary operators (+) and (*) that map any 2 elements of F to another element of F; where (+) and (*) are both commutative; and where each element of F has an inverse with respect to (*), which is also in F.

Polynomial: A construct consisting of repeated applications of (+) and (*) on elements of a Field (which by definition of a Field returns another element of the Field )

Closed Field: If the roots of every polynomial belong to the Field, it is a Closed Field. The set of Complex numbers form a Closed Field, the set of Integers do not.

Algebraic Finite Field: A Field with finite number of elements. It is not a Closed Field, but satisfies other properties of a Field. The number of elements of the Finite Field is called its order. An example is the modulus field, obtained by modulo division by a prime number P. Note that exponentiation on a finite field generates a distribution of all the numbers in the sequence less than the order – for large values of an exponent it produces an highly irregular distribution. This makes the inverse operation of discrete logarithm a hard problem.

Galois Field: Same as above, a finite Field.

Elliptic-Curve: Arises in calculation of arclength of an ellipse and naturally has a group property, where two elements on the curve map to a third element also on the curve. See here. The EC has an interesting property that allows definition of another set of maps from any two points on the field to a third: EC-additon and EC-scalar-multiplication. This is the group operation on Elliptic curves, a reasonable explanation of why it exists is here. These two operations are defined in more detail- http://www.coindesk.com/math-behind-bitcoin/ . Lets observe that they are distinct from, but dependent on the underlying field properties.

Elliptic-Curve over a Field: The set of points (tuples) belonging to a field F that satisfy an elliptic curve equation (y*y = x*x*x+a*x+b) with the (+) and (*) defined according to the field.  The name elliptic curve arises because this equation comes up when calculating the arclength of the ellipse.

Elliptic-Curve over a Finite Field: Tuples drawn from a Finite Field F, that satisfy the elliptic curve equation (y^2 = x^3+ax+b).

Elliptic-Curve Crypto: Built on the hardness of solving the discrete logarithm problem: Given an elliptic curve E defined over a finite field Fq of order(=size) n, a point P ∈ E(Fq ), and a point Q ∈ E(Fq ), find the integer k ∈ [0,n −1] such that Q = k P. The integer k is called the discrete logarithm of Q to the base P, denoted by k = logP Q. This is asking for an inverse operation

In the Bitcoin implementation, the finite field is a prime modulus field Zp and the EC equation is y^2 = x^3+7.  The “curve” on the finite field is highly discontinuous and an example is shown here.

The cryptol verification software has been used to find bugs in the implementation of a Elliptic Curve Crypto implementations. Implementation errors can dramatically impact security properties of crypto that appears mathematically solid.

S-boxes

In several crypto systems, there are P-boxes and S-boxes. P-boxes are permutations of input to outputs. What are S-boxes ? I’ve known that they are static maps, used for “confusion” of input to output, have something to do with Galois fields, and that small changes in them affect the crypto properties of a cryptosystem; so they are analyzed by cryptographers in great detail.  Their design is a bit of a black box. Even in cryptol documentation, they are described as a kind of gift from above.

S-box stands for a substitution box.  The goal of the S-box is that for a small change of input there should be a large change in the output – an avalanche.  This property is clearly not true of pure transposition, substitution, vignere pad ciphers  – where a change of an input letter changes exactly one output letter. Because this property is not true, the simple crypto systems can be broken easily by analyzing output/input pairs.

A P-box can be described by a multiplication of the input (symbols, not sequence) with a matrix which is the identity matrix  with its rows transposed (scrambled) according to the permutation order.  So the P-box transform is an invertible linear transform. Can the S-box be another linear transform ? Suppose it were linear. One could give it unit input vectors and determine the rows of the transform, or use Gauss elimination with known input/output pairs.

So one would want this to be non-linear.  In fact, the AES S-box is a sophisticated permutation map – for each byte another unique byte is mapped in a lookup table . Isn’t a permutation linear ?  The key is that the bits of an output byte are not linearly related to the bits of the input byte. Looking at the Substitution-Permutation Network design makes it clear that this property will break Gauss elimination techniques on input/output pairs .

With that, here’s a reference which discusses the finite field math and the transform f_affine( g_finite_field_inverse( input_x ) ) which comprises the S-box design of AES – http://csrc.nist.gov/archive/aes/rijndael/Rijndael-ammended.pdf . The design goal is explicitly to maximize the non-linearity and beat differential attacks which analyze differences of output with input.

The design considers bits as coefficients of a polynomial (1111 = x^3+x^2+x^1+x^0), bytes as degree 7 polynomials and defines an invertible multiplication of two bytes as a map to another byte. Irreducible polynomials over a field are not factorizable over the field. A polynomial presupposes multiplication and addition of elements (multiplication is distinct from addition, and specifically is not repeated addition – adding a number to itself gives zero). For a modulo(prime) finite group, multiplication can be defined using the same modulo operator that defines the group (In G7, 3*5 = 15%7 = 1). A polynomial representation of a byte is a polynomial of (max) degree 7 with coefficients 0 or 1. The multiplication that maps (byte * byte -> byte) can be defined by taking the product polynomial (a polynomial of max degree 14)  then taking modulus via a irreducible byte polynomial of degree 7 or higher (in analogy with modulo group above) to give back a polynomial of max degree less than 7, i.e. a byte. Note the modulus used in AES is a specific polynomial of degree 8, m(x), which is not itself a byte. The division is done with polynomial long division.

The irreducibility of m(x) of degree 8 implies that the GCD of the product of any two bytes p(x)=a(x).b(x) and m(x) is 1, so m(x) does not cleanly divide any p(x), so the division p(x)/m(x) always has a non-zero remainder for non-zero p(x), hence a non-zero product according to above definition of multiplication. If m(x) were degree 7, then x*m(x)%m(x) = x, so x would map to itself, which is not good. The GCD=1 also implies that each element has a multiplicative inverse. The m(x) chosen is a pentanomial with non-zero terms (8,4,3,1,0) or 100011011 or 0x11B. This design yields modulus math on bytes. So you can raise a byte to the power of n and get back another (well-scrambled) byte.

What a cool secret language. Raising to a power is non-linear. But why choose n=-1 ?

The choice of n=-1 is described in the Rijndael paper here with the reference to Nyberg’94 paper here which carries the line “The author’s attention to the mapping x ->x^-1 was drawn by C. Carlet. He observed that the high nonlinearity property (i) was actually proven in the work of Carlitz and Uchiyama.” The non-linearity  is aimed at defeating linear cryptanalysis where a few input-output pairs can be used to guess the key. The design of the scheme in Nyberg paper is aimed at defeating differential cryptanalysis, by flattening out the variation in the output, so it is close to noise.

Yet, if the letter “s” is common, the corresponding S-box letter (from the lookup table) will also be common. This S-box must be only part of an encryption scheme.

The Nyberg paper has the following line: “However,the high nonlinear order of the inversion mapping and .. comes into effect if these mappings are combined with appropriately chosen linear or affine permutations which may vary from round to round and depend on the secret key.”  A counter paper is here –On Exact Algebraic [Non-]Immunity of S-boxes Based on Power Function

A clear explanation of nonlinearity of a function F as the minimum hamming distance between bytes reachable by linear functions and bytes reachable by the non-linear function F is found in The Design Of S-Boxes by Jennifer Miuling Cheung (2010) p.13.  For two input bits, there are 8 linear functions (1, x1, x2, x1+1, x2+1, x1+x2, x1+x2+1) and 4 input sequences of 00,01,10,11; the output of each of the 8 functions on the 4 input sequences gives 8 output vectors of length 4. There are 16 vectors of length 4 of which linear functions reach only 8, the remaining 8 are nonlinear.