Stabilizing Cryptocurrency

Before I forget it’s name, BaseCoin is a project that attempts to stabilize a cryptocurrency, so it does not have wild fluctuations. 

Regular (Fiat) currencies are actively managed by Federal banks to be stable and are also stabilized by being the default currency for labor negotiations, employment contracts, retirement accounts etc which are slow moving changes.

More on crypto stabilization in this post –!/main/articles/post-09-the-intelligent-cryptocurrency-investor .


Decentralized Identity Based on Blockchain

Sovrin project. Uses a Permissioned blockchain which allows it to do away with mining as an incentive and instead directly build a Distributed Ledger Technology which stores Distributed Identifiers (DIDs) and maps them to claims. Removal of mining frees up resources and increases network throughput. Interesting Key Management aspects, including revocation. Contrasts with Ethereum uPort – which is permissionless and public. Neat design, but will face adoption problem as it is unhitched from bitcoin/ethereum.–digital-identities-in-the-blockchain-era.pdf

DPKI – Distributed PKI. Attempts to do reduce the weakness of a centralized certificate authority as compromising that cert authority affects each of its issued certificates. This concept is built out and used in Sovrin.

Remme. Remember me. An approach to SSL based logins. Modifies SSL.
Used an EmerCoin implementation as mvp and Ethereum blockchain. EmerCoin: . Adoption problem here is change in behavior of each browser and mobile app.

Sidechains. Original proposal was to free up resources for when trust is established, to reuse blockchain technology and to establish a two-way peg between the sidechain and the blockchain.

Coco Framework.

HyperLedger – Linux based framework for developing blockchains software. Provides a DLT and uses Intel SGS extensions. (Intel+Microsoft+Linux foundation). Uses a replicated state machine model with each validating peer independently adding to its chain after reaching consensus on order of txns with other peers using Practical Byzantine Fault Tolerance or Proof of Elapsed Time. . Related –

A comparison link –


SHA-3 Hash Construction

Keccak (or SHA-3, a ‘qualified successor’ to SHA-2) is a hash function based on interesting and novel ideas and claims to be post-quantum sufficient or quantum resistant.

Some Keccak sponge-hash-function pointers:

A sponge function is a generalization of both a) hash functions, which have a fixed output length, and b) stream ciphers (=state cipher), which have a fixed input length.

Random sponge = Random permutation

The spongeconstruction:

First, the input string is padded with a reversible padding rule and cut into blocks of r bits. Then the b bits of the state are initialized to zero and the sponge construction proceeds in two phases:

  • In the absorbing phase, the r-bit input blocks are XORed into the first r bits of the state, interleaved with applications of the function f, a fixed length permutation/transformation function function. When all input blocks are processed, the sponge construction switches to the squeezing phase.
  • In the squeezing phase, the first r bits of the state are returned as output blocks, interleaved with applications of the function f. The number of output blocks is chosen at will by the user.

The last c bits of the state are never directly affected by the input blocks and are never output during the squeezing phase.”

What security does Keccak offer against quantum attacks, specifically Grover’s algorithm?

“SHA-3 was designed to be very efficient in hardware but is relatively slow in software. SHA-3 takes about double the time compared to SHA-2 to run in software and about a quarter of the time to run in hardware.” This makes it less suitable than SHA-2 for key stretching, at least against an attacker that is hardware equipped.

The advantage of SHA-3 is that a computationally-simpler SHA-3(key | data) can suffice as a MAC.”

NVidia Tiny Linux Kernel and TrustZone

The NVidia Tiny Linux Kernel (TLK), is 23K lines of BSD licensed code, which supports multi-threading, IPC and thread scheduling and implements TrustZone features of a Trusted Execution Environment (TEE). It is based on the Little Kernel for embedded devices.

The TEE is an isolated environment that runs in parallel with an operating system, providing security. It is more secure than an OS and offers a higher level of functionality than a SE, using a hybrid approach that utilizes both hardware and software to protect data. Trusted applications running in a TEE have access to the full power of a device’s main processor and memory, while hardware isolation protects these from user installed apps running in a main operating system. Software and cryptographic isolation inside the TEE protect the trusted applications contained within from each other. A paper describing the separation with alternatives for virtualizing the TEE appeared at

TrustZone was developed by Trusted Foundations Software which was acquired by Gemalto. Giesecke & Devrient developed a rival implementation named Mobicore. In April 2012 ARM, Gemalto and Giesecke & Devrient combined their TrustZone portfolios into a joint venture Trustonic, which was the first to qualify a GlobalPlatform-compliant TEE product in 2013.

A comparison with other hardware based security technologies is found here. Whereas a TPM is exclusively for security functions and does not have access to the CPU,  the TEE does have such access.

Attacks against TrustZone on Android are described in this blackhat talk. With a TEE exploit,  “avc_has_perm” can be modified to bypass SELinux for Android. By the way, Access Vectors in SELinux are described in this wonderful link. “avc_has_perm” is a function to check the AccessVectors allows permission.

NVidia Volta GPU vs Google TPU

A Graphics Processing Unit (GPU) allows multiple hardware processors to act in parallel on a single array of data, allowing a divide and conquer approach to large computational tasks such as video frame rendering, image recognition, and various types of mathematical analysis including convolutional neural networks (CNNs). The GPU is typically placed on a larger chip which includes CPU(s) to direct data to the GPUs. This trend is making supercomputing tasks much cheaper than before .

Tesla_v100 is a System on Chip (SoC) which contains the Volta GPU which contains TensorCores, designed specifically for accelerating deep learning, by accelerating the matrix operation D = A*B+C, each input being a 4×4 matrix.  More on Volta at . It is helpful to read the architecture of the previous Pascal P100 chip which contains the GP100 GPU, described here – .  Background on why NVidia builds chips this way (SIMD < SIMT < SMT) is here – .

Volta GV100 GPU = 6 GraphicsProcessingClusters x  7 TextureProcessingCluster/GraphicsProcessingCluster x 2 StreamingMultiprocessor/TextureProcessingCluster x (64 FP32Units +64 INT32Units + 32 FP64Units +8 TensorCoreUnits +4 TextureUnits)

The FP32 cores are referred to as CUDA cores, which means 84×64 = 5376 CUDA cores per Volta GPU. The Tesla V100 which is the first product (SoC) to use the Volta GPU uses only 80 of the 84 SMs, or 80×64=5120 cores. The frequency of the chip is 1.455Ghz. The Fused-Multiply-Add (FMA) instruction does a multiplication and addition in a single instruction (a*b+c), resulting in 2 FP operations per instruction, giving a FLOPS of 1.455*2*5120=14.9 Tera FLOPs due to the CUDA cores alone. The TensorCores do a 3d Multiply-and-Add with 7x4x4+4×4=128 FP ops/cycle, for a total of 1.455*80*8*128 = 120TFLOPS for deep learning apps.

3D matrix multiplication3d_matrix_multiply

The Volta GPU uses a 12nm manufacturing process, down from 16nm for Pascal. For comparison the Jetson TX1 claims 1TFLOPS and the TX2 twice that (or same performance with half the power of TX1). The VOLTA will be available on Azure, AWS and platforms such as Facebook.  Several applications in Amazon. MS Cognitive toolkit will use it.

For comparison, the Google TPU runs at 700Mhz, and is manufactured with a 28nm process. Instead of FP operations, it uses quantization to integers and a systolic array approach to minimize the watts per matrix multiplication, and optimizes for neural network calculations instead of more general GPU operations.  The TPU uses a design based on an array of 256×256 multiply-accumulate (MAC) units, resulting in 92 Tera Integer ops/second.

Given that NVidia is targeting additional use cases such as computer vision and graphics rendering along with neural network use cases, this approach would not make sense.

Miscellaneous conference notes:

Nvidia DGX-1. “Personal Supercomputer” for $69000 was announced. This contains eight Tesla_v100 accelerators connected over NVLink.

Tesla. FHHL, Full Height, Half Length. Inferencing. Volta is ideal for inferencing, not just training. Also for data centers. Power and cooling use 40% of the datacenter.

As AI data floods the data centers, Volta can replace 500 CPUswith 33 GPUs.
Nvidia GPU cloud. Download the container of your choice. First hybrid deep learning cloud network. . Private beta extended to gtc attendees.

Containerization with GPU support. Host has the right NVidia driver. Docker from GPU cloud adapts to the host version. Single docker. Nvidiadocker tool to initialize the drivers.

Moores law comes to an end. Need AI at the edge, far from the data center. Need it to be compact and cheap.

Jetson board had a Tegra SoC chip which has 6cpus and a Pascal GPU.

AWS Device Shadows vs GE Digital Twins. Different focus. Availabaility+connectivity vs operational efficiency. Manufacturing perspective vs operational perspective. Locomotive may  be simulated when disconnected .

DeepInstinct analysed malware data using convolutional neural networks on GPUs, to better detect malware and its variations. – deep learning for time series data to detect anomalous conditions on sensors on the field such as pressure in a gas pipeline.

GANS applications to various problems – will be refined in next few years.

GeForce 960 video card. Older but popular card for gamers, used the Maxwell GPU, which is older than Pascal GPU.

Cooperative Groups in Cuda9. More on Cuda9.

Weeping Angel

Sales of hardware camera blockers and similar devices should increase, with the Weeping Angel disclosure. Wikileaks detailed how the CIA and MI5 hacked Samsung TVs to silently monitor remote communications. Interesting to read for the level of technical detail: . The attack was called ‘Weeping Angel’, a term borrowed from Doctor Who.

Other such schemes are described at , including a iPhone implant to get data from your phone – .

Golang interface ducktype and type assertion

The interface{} type in Golang is like the duck type in python. If it walks like a duck, it’s a duck – so the type is determined by an attribute of the variable. This duck typing support in python often leaves one searching for the actual type of the object that a function takes or returns. But with richer names, or a naming convention, one gets past this drawback. Golang tries to implement a more limited and stricter duck typing – the programmer can get define the type of a variable as an interface{} – which should have been called a ducktype{}. But when it comes time to determine the type of the duck, one must assert it explicitly – and in that process can receive an error indicating if there was a type mismatch.

explicit ducktype creation

var myVariableOfDuckType interface{} = “thisStringSetsMyVarToTypeString”

var mySecondVarOfDuckType interface{} = 123  // sets type of mySecondVar to int

ducktype assertion

getMystring, isOk := myVariableOfDuckType.(string) // isOk is true, assertion to string passed

getMystring, isOk := mySecondVarOfDuckType.(string) // isOk is false, assertion to string  failed

In python, int(123) and int(“123”) both return 123.

In Golang, int(mySecondVarOfDuckType) will not return 123, even though the value actually happens to be an int. It will instead return a “type assertion error”.

cannot convert val (type interface {}) to type int: need type assertion