Category: software development

JavaScript Timing and Meltdown

In response to the Meltdown/Spectre side-channel vulnerabilities, which rely on fine-grained observation of the CPU to infer the cache state of an adjacent process or VM, one mitigation adopted by browsers was to reduce the resolution of various time APIs, especially in JavaScript.

Researchers have responded with alternative sources of fine-grained timing available to browsers. One interpolation method recovers a resolution of about 15 μs from a timer that is rounded down to multiples of 100 ms.
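
To show the interpolation idea roughly, here is a minimal sketch (not the researchers' method; written in Scala just for convenience, with an invented TimerInterpolation object and a simulated 100 ms clock): spin-count between edges of a coarse clock to calibrate a much finer counter.

object TimerInterpolation {
  val granularityMs = 100L

  // Simulated coarse clock: wall time rounded down to multiples of granularityMs.
  def coarseNow(): Long = (System.currentTimeMillis() / granularityMs) * granularityMs

  // Spin until the coarse clock ticks over, counting iterations in between.
  def spinsUntilNextTick(): Long = {
    val start = coarseNow()
    var spins = 0L
    while (coarseNow() == start) spins += 1
    spins
  }

  def main(args: Array[String]): Unit = {
    spinsUntilNextTick()                  // align to a clock edge first
    val perTick = spinsUntilNextTick()    // spin iterations per 100 ms tick
    val usPerSpin = granularityMs * 1000.0 / perTick
    println(f"$perTick%d spins per tick, ~$usPerSpin%.3f microseconds per spin")
    // A short operation can now be timed by counting spins around it,
    // at much finer resolution than the coarse clock itself provides.
  }
}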

The JavaScript High Resolution Time API is still widely available and is described at https://www.w3.org/TR/hr-time/, with a reference to previous work on cache attacks, “Practical cache attacks in JS”.

A Meltdown PoC at https://github.com/gkaindl/meltdown-poc tests the timing attack in its own process. The RDTSC instruction returns the Time Stamp Counter (TSC), a 64-bit register that counts the number of cycles since reset, and so has a resolution of 0.5 ns on a 2 GHz CPU.

#include <stdio.h>
#include <x86intrin.h>

int main() {
  unsigned long long i = __rdtsc();  // read the 64-bit Time Stamp Counter
  printf("%llu\n", i);
  return 0;
}

Ethereum Security and the DAO Solidity Attack

The basics of Ethereum are described in the Gavin Wood paper. A list of keywords in Solidity is described in this file from its source, which includes “address”, “contract”, “event”, “mapping” and “wei” (1 ETH = 10^18 wei). This list does not include “gas”, which is a mechanism described in Wood’s paper to combat abuse. Interestingly, the paper says “The first example of utilising the proof-of-work as a strong economic signal to secure a currency was by Vishnumurthy et al [2003]”, aka the Karma paper.

The Karma paper talks about solving a cryptographic puzzle as enabling one to join the network and be assigned a bank-set: “the node certifies that it completed this computation by encrypting challenges provided by its bank-set nodes with its private key. Thus each node is assigned an id beyond its immediate control, and acquires a public-private key pair that can be used in later stages of the protocol without having to rely on a public-key infrastructure”. Crypto puzzles for diverse problems have been proposed before; a survey and comparison is at https://pdfs.semanticscholar.org/d8b9/a0309cef8c309541876c9c2c5ad5c16c3b7a.pdf

The DAO attack had three components: a hacker, a malicious contract and a vulnerable contract. The malicious contract is used to repeatedly withdraw funds from the vulnerable contract before it gets a chance to decrement its balance. Oddly enough, the gas mechanism, which is supposed to limit computation, did not kick in to stop this repeated remittance.
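
To make the pattern concrete, here is a minimal, language-agnostic sketch of that reentrancy flaw, written in Scala purely for illustration (this is not Solidity, and the names are invented): the vulnerable contract pays out before it updates its books, so a re-entrant caller gets paid repeatedly against a stale balance.

object ReentrancySketch {
  import scala.collection.mutable

  class VulnerableContract {
    val balanceOf = mutable.Map("attacker" -> 10, "others" -> 90)
    var paidOut = 0

    def withdraw(who: String, receive: Int => Unit): Unit = {
      val amount = balanceOf(who)
      if (amount > 0) {
        paidOut += amount
        receive(amount)       // control passes to the caller here...
        balanceOf(who) = 0    // ...and the balance is only zeroed afterwards
      }
    }
  }

  class Attacker(target: VulnerableContract) {
    var depth = 0
    // The receive hook re-enters withdraw while the stale balance is still recorded.
    def receive(amount: Int): Unit = {
      depth += 1
      if (depth < 5) target.withdraw("attacker", receive) // on-chain, gas was expected to bound this
    }
  }

  def main(args: Array[String]): Unit = {
    val contract = new VulnerableContract
    val attacker = new Attacker(contract)
    contract.withdraw("attacker", attacker.receive)
    // The attacker's recorded balance was only 10, yet 50 was paid out of the shared pool.
    println(s"paid out: ${contract.paidOut}, attacker balance now: ${contract.balanceOf("attacker")}")
  }
}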

A few weeks before the DAO attack, someone had pointed out to me that the security of Solidity was a bit of an open problem. My feeling was that contracts should be layered above the value-exchange mechanism, not built into it. Bitcoin-based protocols with the simpler OP_RETURN semantics appeared more solid. Later, around October ’16 at an Ethereum meetup, Fred Ehrsam made the comment that most new projects would be using Ethereum instead of Bitcoin. But Bitcoin meetups had more real-world use cases being discussed. The technical limitations exist, and are being addressed by forks such as SegWit2x this November. Today saw a number of interesting proposals built on Ethereum, including Dharma, DataWaller and BloomIDs. Security will be a continuing concern as the scope of such projects expands.

Golang interface ducktype and type assertion

The interface{} type in Golang is like the duck type in Python. If it walks like a duck, it’s a duck: the type is determined by attributes of the variable. Duck typing in Python often leaves one searching for the actual type of the object that a function takes or returns, though richer names or a naming convention get past this drawback. Golang implements a more limited and stricter form of duck typing: the programmer can define the type of a variable as interface{} (which might as well have been called ducktype{}). But when it comes time to determine the concrete type of the duck, one must assert it explicitly, and in that process can receive an error indicating a type mismatch.

explicit ducktype creation

var myVariableOfDuckType interface{} = "thisStringSetsMyVarToTypeString"

var mySecondVarOfDuckType interface{} = 123  // sets type of mySecondVarOfDuckType to int

ducktype assertion

getMystring, isOk := myVariableOfDuckType.(string) // isOk is true, assertion to string passed

getMystring, isOk := mySecondVarOfDuckType.(string) // isOk is false, assertion to string  failed

In Python, int(123) and int("123") both return 123.

In Golang, int(mySecondVarOfDuckType) will not return 123, even though the value actually happens to be an int. It instead fails to compile with a “need type assertion” error; the assertion mySecondVarOfDuckType.(int) is required.

cannot convert val (type interface {}) to type int: need type assertion

Git Merge. You are in the middle of a merge. Cannot Amend.

Let’s say I made changes to branch “abc”, then committed and pushed them. This triggered a build and a code review, after which the code was ready to be merged. But before these changes were merged, another party merged changes from a branch “xyz” into “abc”.

How do I replay my changes on top of the changes from “xyz” in “abc”? Here are the steps, run from the root of the directory tree where the changes are located.

$ git pull origin abc // tldr don't do this. try git pull --rebase
Auto-merging file1
CONFLICT (content): Merge conflict in file1
Automatic merge failed; fix conflicts and then commit the result.

At this point, I resolve the conflicts and attempt a git commit --amend. But this gives an error. The problem is the pull step itself, which does a fetch+merge, and it is the merge that fails (see git pull --help).

$ git commit --amend
fatal: You are in the middle of a merge -- cannot amend.
$ git rebase 
First, rewinding head to replay your work on top of it...
Applying: change name
Using index info to reconstruct a base tree...
M file1
Falling back to patching base and 3-way merge...
Auto-merging file1
CONFLICT (content):

To recover from this, squash the unnecessary merge and do a rebase.

$ git reset --merge 
$ git rebase
First, rewinding head to replay your work on top of it..

This shows a list of conflicting files. Find the conflicting files and edit them to resolve conflicts.

$ git rebase --continue
file: needs merge
You must edit all merge conflicts and then
mark them as resolved using git add

$ git add file1 file2
$ git rebase --continue
Applying: change name
$ git commit --amend
$ git push origin HEAD:refs/for/abc

Here’s the git rebase doc for reference. The rebase command is composed of multiple cherry-pick commands. Cherry-pick is a change-parent operation: it changes commit-id = hash(parent-commit-id + changes). Here it re-chains my commit on top of a different commit than the one I started with. A commit --amend keeps the same parent, so it cannot do that re-chaining, and it fails.

Another error sometimes seen during a cherry-pick is “fatal: You are in the middle of a cherry-pick -- cannot amend”. This happens on a git commit --amend; here git expects you to do a plain git commit first, to complete the cherry-pick operation.

Some idempotent (safe) and useful git commands.

$ git reflog [--date=iso]  # show history of local commits to index
$ git show  # or git show commit-hash
$ git diff [ --word-diff | --color-diff ]
$ git status # shows files staged for commit (added via git add to the index). different from git diff
$ git branch [-a]
$ git tag -l
$ git describe # most recent tag reachable from current commit
$ git config --list
$ git remote -v
$ git fetch --dry-run 
$ git log --decorate --graph --abbrev-commit --date=relative --author=username
$ git log --decorate --pretty="format:%C(yellow)%h%C(green)%d%Creset %s -> %C(green)%an%C(blue), %C(red)%ar%Creset"
$ git log -g --abbrev-commit --pretty=oneline 'HEAD@{now}'
$ git grep login -- "*yml"
$ tig    # use d, t, g to change views; use / to search
$ git fsck [--unreachable]

The difference between the ‘origin’ and ‘upstream’ terms is explained here. An interactive git cheat sheet is here.

Update sequence: working_directory -> index -> commit -> remote

origin = main remote repository

master = default (master/main) branch on the remote repository

HEAD = current revision

Git uses the SHA-1 cryptographic hash function for content-addressing individual files (blobs) and directories (trees), as well as commits, which track changes to the entire tree. Since each tree is a SHA-1 hash over the SHA-1 hashes of the files and subtrees it contains, a change to a child that is n levels deep propagates up to the root of the tree.
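
As a small illustration of the blob half of this, here is a sketch (in Scala, with an invented GitBlobHash object) that reproduces git’s content-addressing for a single file, assuming the standard "blob <size>\0<content>" object header; its output can be compared with git hash-object.

import java.security.MessageDigest

object GitBlobHash {
  // SHA-1 over "blob <size>\0<content>", the form git uses to store a file's contents.
  def blobSha1(content: String): String = {
    val bytes = content.getBytes("UTF-8")
    val store = s"blob ${bytes.length}".getBytes("UTF-8") ++ Array[Byte](0) ++ bytes
    MessageDigest.getInstance("SHA-1").digest(store).map("%02x".format(_)).mkString
  }

  def main(args: Array[String]): Unit = {
    // `echo hello | git hash-object --stdin` should print the same value:
    // ce013625030ba8dba906f756967f9e9ca394464a
    println(blobSha1("hello\n"))
  }
}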

WebServices Composition with AWS

Some interesting diagrams on the composition of a device data processing pipeline with AWS are at http://aws-de-media.s3.amazonaws.com/images/jmetzner_Hackday_Berlin.pdf

The services listed are:
Amazon Cognito: Identity and Security. Gets token with role for API access by a certain user.
Amazon Kinesis: Massive data ingestion. Uses token auth, but token signing can be easiest.
AWS Lambda: Serverless Data Compute. Supposed to save on EC2 instance costs ( at the expense of lock-in ).
Amazon S3: Virtually unlimited storage. This is what really makes AWS tick.
Amazon Redshift: Petabyte-scale data analysis
It does not say what data goes to S3 and what data goes to the database.
On Redshift, here’s a comment from Nokia at http://www.cio.com/article/2860383/data-warehousing/7-amazon-redshift-success-stories.html#slide3, where their volume of data “literally broke the database”, prompting them to look for more scalable solutions.
There is a tension between “servers” and “services”, which goes back to the IaaS vs PaaS distinction. PaaS can be faster to develop with, given the reduced focus on server maintenance. However, the number of PaaS concepts to deal with is neither small nor particularly inviting: instead of a single server, one now has to deal with multiple services, each of which has to be authenticated, priced, and guarded against possible misuse, and each of which has the potential for surprises. A key to simplicity is how composable the services are.

Spark and Scala

Spark is used for distributed data processing of various kinds, e.g. fraud detection and intrusion detection. It has the notion of Resilient Distributed Datasets (RDDs). The “resilience” has to do with the lineage of a data structure, not with replication. Lineage is the set of operators applied to the original data structure; lineage and metadata are used to recover lost data through recomputation in case of node failures.

Spark word count example discussed in today’s meetup.

val textFile = sc.textFile("obama.txt")
val counts = textFile.flatMap(line=>line.split(" ")).filter(_.length>4).map(word=>(word,1)).reduceByKey(_+_)
val sortedCounts = counts.map(_.swap).sortByKey(false)
sortedCounts.take(10)
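
The lineage mentioned above can be inspected directly in the spark-shell: toDebugString prints the chain of parent RDDs and transformations that Spark would replay to recompute lost partitions.

println(counts.toDebugString)   // shows the recorded lineage of the counts RDD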

Scala is a functional programming language which is used in Spark. It prefers immutable data structures. Sounds great! How are state changes done, then? Through function calls. Recursion has a bigger role to play because it is a way for state changes to happen via function calls; the stack is utilized for the writes, rather than the heap. I recalled seeing a spiral Scala program earlier and found one here on the web. I modified it to draw the reverse spiral; here is the resulting code. The takeaway is that functional programs are structured differently, and one can do some things more naturally. It feels closer to how the mind works. As long as one gets the base cases right, one can build a large amount of complexity trivially. On the other hand, if one has to start top-down and debug a large call stack, it can be challenging.

// rt annotated spiral program.
// source http://www.kaiyin.co.vu/2015/10/draw-plain-text-spiral-in-scala.html
// reference: http://www.cis.upenn.edu/~matuszek/Concise%20Guides/Concise%20Scala.html
// syntax highlight: http://bsnyderblog.blogspot.com/2012/12/vim-syntax-highlighting-for-scala-bash.html
import java.io.{File, PrintWriter}
import scala.io.Source

object SpiralObj {   // object keyword => a singleton object of a class defined implicitly by the same name
  object Element {   // nested companion-style object. how is Element a singleton ? there are several elements. has 3 subclasses which are not singletons
    private class ArrayElement(  // subsubclass, not a singleton
                                val contents: Array[String]  // "primary constructor" is defined in class declaration, must be called
                                ) extends Element
    private class LineElement(s: String) extends Element {
      val contents = Array(s)
    }
    private class UniformElement(  // height and width of a line segment. what if we raise width to 2. works.
                                  ch: Char,
                                  override val width: Int,   // override keyword is required to override an inherited method
                                  override val height: Int
                                  ) extends Element {
      private val line = ch.toString * width  // fills the characters in a line
      def contents = Array.fill(height)(line) // duplicates line n(=height) times, to create a width*height rectangle
    }
    // three constructor like methods
    def elem(contents: Array[String]): Element = {
      new ArrayElement(contents)
    }
    def elem(s: String): Element = {
      new ArrayElement(Array(s))
    }
    def elem(chr: Char, width: Int, height: Int): Element = {
      new UniformElement(chr, width, height)
    }
  }


  abstract class Element {
    import Element.elem
    // contents to be implemented
    def contents: Array[String]

    def width: Int = contents(0).length

    def height: Int = contents.length

    // prepend this to that, so it appears above
    def above(that: Element): Element = {      // above uses widen
      val this1 = this widen that.width
      val that1 = that widen this.width
      elem(this1.contents ++ that1.contents)
    }

    // prefix new bar line by line
    def beside(that: Element): Element = {     // beside uses heighten
      val this1 = this heighten that.height
      val that1 = that heighten this.height
      elem(
        for ((line1, line2) <- this1.contents zip that1.contents)
          yield line1 + line2
      )
    }

    // add padding above and below
    def heighten(h: Int): Element = {          // heighten uses above
      if (h <= height) this
      else {
        val top = elem(' ', width, (h - height) / 2)
        val bottom = elem(' ', width, h - height - top.height)
        top above this above bottom
      }
    }

    // add padding left and right
    def widen(w: Int): Element = {             // widen uses beside
      if (w <= width) this
      else {
        val left = elem(' ', (w - width) / 2, height)
        val right = elem(' ', w - width - left.width, height)
        left beside this beside right
      }
    }

    override def toString = contents mkString "\n"
  }


  object Spiral {
    import Element._
    val space = elem("*")
    val corner1 = elem("/")
    val corner2 = elem("\\")
    def spiral(nEdges: Int, direction: Int): Element = { // clockwise spiral
      if(nEdges == 0) elem("+")
      else {
        //val sp = spiral(nEdges - 1, (direction + 1) % 4) // or (direction - 1) % 4, but we don't want negative numbers
        val sp = spiral(nEdges - 1, (direction + 3) % 4) // or (direction - 1) % 4, but we don't want negative numbers
        var verticalBar = elem('|', 1, sp.height)        // vertBar and horizBar have last two params order switched
        var horizontalBar = elem('-', sp.width, 1)
        val thick = 1
        // at this stage, assume the n-1th spiral exists and you are adding another "line" to it (not a whole round)
        // use "above" and "beside" operators to attach the line to the spiral
        if(direction == 0) {
          horizontalBar = elem('r', sp.width, thick)
          (corner1 beside horizontalBar) above (sp beside space) //  order is left to right
        }else if(direction == 1) {
          verticalBar = elem('d',thick, sp.height)
          (sp above space) beside (corner2 above verticalBar)
        } else if(direction == 2) {
          horizontalBar = elem('l', sp.width, thick)
          (space beside sp) above (horizontalBar beside corner1)
        } else {
          verticalBar = elem('u',thick, sp.height)
          (verticalBar above corner2) beside (space above sp)
        }
      }
    }

    def revspiral(nEdges: Int, direction: Int): Element = { // try counterclockwise
      if(nEdges == 0) elem("+")
      else {
        //val sp = spiral(nEdges - 1, (direction + 1) % 4) // or (direction - 1) % 4, but we don't want negative numbers
        val sp = revspiral(nEdges - 1, (direction + 3) % 4) // or (direction - 1) % 4, but we don't want negative numbers
        var verticalBar = elem('|', 1, sp.height)        // vertBar and horizBar have last two params order switched
        var horizontalBar = elem('-', sp.width, 1)
        val thick = 1
        // at this stage, assume the n-1th spiral exists and you are adding another "line" to it (not a whole round)
        if(direction == 0) { // right
          horizontalBar = elem('r', sp.width, thick)
          (sp beside space) above (corner2 beside horizontalBar)
        }else if(direction == 1) { // up
          verticalBar = elem('u',thick, sp.height)
          (space above sp) beside (verticalBar above corner1)
        } else if(direction == 2) { // left
          horizontalBar = elem('l', sp.width, thick)
          (horizontalBar beside corner2 ) above (space beside sp) 
        } else { // down
          verticalBar = elem('d',thick, sp.height)
          (corner1 above verticalBar) beside (sp above space)
        }
      }
    }
    def draw(n: Int): Unit = {
      println()
      println(spiral(n, n % 4))  // %4 returns 0,1,2,3 .    right, down, left, up
      println()
      println(revspiral(n, n % 4))  // %4 returns 0,1,2,3   
    }
  }
}

object Main {
  def usage(): Unit = {
    print("usage: scala Main szInt")
  }

  def main(args: Array[String]): Unit = {
    import SpiralObj._
    if (args.length > 0) {
      val spsize = args(0)
      Spiral.draw(spsize.toInt)
    } else {
      usage()
      println()
    }
  }
}

A note on tail-call recursion. If the last statement of a function is a call to another function, then the return position of the called function is the same as that of the calling function, and the current stack frame remains valid for the called function. Such a function is tail-recursive and the effect is that of a loop: a series of function calls can be made without consuming stack space.
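
A minimal Scala sketch of this (the TailRecDemo object and sum function are illustrative): the @tailrec annotation makes the compiler verify that the recursive call is in tail position and compile it down to a loop.

import scala.annotation.tailrec

object TailRecDemo {
  @tailrec
  def sum(n: Long, acc: Long = 0L): Long =
    if (n == 0) acc
    else sum(n - 1, acc + n)      // tail position: nothing left to do after the call returns

  def main(args: Array[String]): Unit = {
    println(sum(1000000L))        // runs in constant stack space; no StackOverflowError
  }
}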

On Software Requirements

There are a couple of high-level tradeoffs in the requirements specification process. Each tradeoff can be thought of as an axis: specificity (detailed vs vague), audacity (visionary vs trivial/checkmark), and customer-driven (needs vs wants, with timelines).

It is possible for requirements to be too detailed: the more detailed and specific they are, the less understandable they are and the less flexible they are in a rapidly changing context. But if the requirements are too vague, then they are likely to be misunderstood or ignored by the development team. This is a case where talking directly to end users, and clear communication between team members to flesh out use cases, will help.

Also, if the requirements are too visionary, they may appear infeasible to the team. Showing that they are achievable by pointing to related products is one solution; decomposing the target into achievable modules is another. If they are too near-term, then they may appear trivial and fail to excite the team.

Finally, the requirements should be well grounded in customer use cases and narrowly stated, rather than inherited as a long list from past successful technical products. This is probably the most important and hardest thing to do in practice.

Specifying the right amount of detail for development targets that are grounded, challenging and achievable is an important skill.

Another take on this topic is Joel Spolsky’s series on writing painless functional specifications.