
Machine Learning Security

Seven security concerns in Machine Learning (ML):

  1. Data privacy and security: ML requires large amounts of data to be trained, and this data may contain sensitive or personal information. Appropriate measures need to be put in place to prevent data from being accessed by unauthorized parties.
  2. Notebooks security: ML typically requires Jupyter or similar notebooks to be served for data scientists to work on data, code, and models, both individually and collaboratively. These notebooks need to be access controlled and protected from unauthorized access. This includes the code and git repos that host the code, and the model artifacts that the notebook uses or creates.
  3. Model serving and inference security: ML models in production are commonly served and accessed over inference endpoints. Such endpoints need authentication, authorization, and encryption to protect them against misuse. When a model behind an endpoint is upgraded, or when the endpoint or its configuration changes, the attacks typical of a devops/devsecops pipeline become possible. These need to be protected against.
  4. Model security: Models can be vulnerable to attacks such as adversarial inputs, where an attacker intentionally manipulates the input to the model in order to cause it to make incorrect predictions. A related failure is when the model makes an egregiously bad decision on an input, for example a self-driving car hitting an obstacle instead of avoiding it. It is important to harden the model and bound the decisions that come from its use.
  5. Misuse: Even if a model works as designed, it can be misused, for example by generating fake or misleading content. It is important to consider the potential unintended consequences of using models and to put safeguards in place to prevent their misuse.
  6. Bias: ML models can sometimes exhibit biases due to the data they are trained on. There should be a plan to identify biases in a model and take steps to mitigate them.
  7. Intellectual property: ML models may be protected by intellectual property laws, and it is important to respect these laws and obtain the appropriate licenses when using models developed by others.
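As one illustration of item 4's advice to bound the decisions that come from a model, here is a minimal Python sketch. It assumes a model that maps a feature vector to a single number. The names Bounds, bounded_predict, and toy_model are hypothetical and not taken from any library, and a real system would likely need domain-specific checks on top of this.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Bounds:
    """Inclusive numeric range (hypothetical helper for this sketch)."""
    low: float
    high: float


def bounded_predict(
    model: Callable[[Sequence[float]], float],
    features: Sequence[float],
    input_bounds: Sequence[Bounds],
    output_bounds: Bounds,
    fallback: float,
) -> float:
    """Call the model only on in-range inputs and clamp its output."""
    # Refuse inputs with the wrong shape.
    if len(features) != len(input_bounds):
        return fallback
    # Refuse inputs outside the range seen in training (NaN fails this check too).
    for value, bounds in zip(features, input_bounds):
        if not (bounds.low <= value <= bounds.high):
            return fallback
    raw = model(features)
    # A NaN output cannot be clamped meaningfully, so fall back instead.
    if math.isnan(raw):
        return fallback
    # Clamp the decision to the range the downstream system can safely act on.
    return min(max(raw, output_bounds.low), output_bounds.high)


def toy_model(features: Sequence[float]) -> float:
    """Stand-in for a real model: deliberately overshoots the safe range."""
    return 10.0 * features[0]


in_range = bounded_predict(toy_model, [0.5], [Bounds(0.0, 1.0)], Bounds(-1.0, 1.0), 0.0)
out_of_range = bounded_predict(toy_model, [7.0], [Bounds(0.0, 1.0)], Bounds(-1.0, 1.0), 0.0)
print(in_range)      # 1.0 (raw output 5.0 clamped to the upper bound)
print(out_of_range)  # 0.0 (input rejected, fallback returned)
```

The wrapper does not make the model itself more robust. It only limits how far a bad prediction can propagate, which is why it complements rather than replaces hardening techniques.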

SolarWinds incident overview

SolarWinds makes software for managing networks and infrastructure. Its Orion software was the target of an advanced cyberattack in 2020. Attackers acquired superuser access to the certificate used to sign SAML tokens, then used that certificate to forge new tokens that gave them highly privileged access to networks.

Attackers may have compromised internal build or distribution systems of SolarWinds, embedding backdoor code into a legitimate SolarWinds library with the file name SolarWinds.Orion.Core.BusinessLayer.dll. This backdoor could then be distributed via automatic updates in target networks.

The malicious DLL called out to remote network infrastructure under the domain avsvmcloud.com to prepare possible second-stage payloads, move laterally in the organization, and compromise or exfiltrate data.
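One way defenders can act on this is to search DNS logs for lookups of that domain or any of its subdomains. The Python sketch below is a minimal example under stated assumptions: the log line layout ("timestamp client_ip queried_name"), the sample hostnames, and the function names are all hypothetical, and only the domain itself comes from the text above.

```python
from typing import Iterable, List, Set

# Domain named in the incident description; extend with other indicators as needed.
IOC_DOMAINS: Set[str] = {"avsvmcloud.com"}


def matches_ioc(hostname: str, ioc_domains: Set[str] = IOC_DOMAINS) -> bool:
    """Return True if hostname is an indicator domain or a subdomain of one."""
    # Normalize case and drop the trailing dot of a fully qualified name.
    name = hostname.strip().rstrip(".").lower()
    return any(name == d or name.endswith("." + d) for d in ioc_domains)


def find_suspicious_queries(log_lines: Iterable[str]) -> List[str]:
    """Return the log lines whose queried name matches an indicator domain.

    Assumes whitespace-separated lines: <timestamp> <client_ip> <queried_name>.
    """
    hits = []
    for line in log_lines:
        fields = line.split()
        if len(fields) >= 3 and matches_ioc(fields[2]):
            hits.append(line)
    return hits


# Illustrative log lines, not real traffic.
sample_log = [
    "2020-12-14T10:00:00Z 10.0.0.5 www.example.org",
    "2020-12-14T10:00:01Z 10.0.0.7 host1.avsvmcloud.com.",
    "2020-12-14T10:00:02Z 10.0.0.9 notavsvmcloud.com",
]
hits = find_suspicious_queries(sample_log)
print(hits)  # only the 10.0.0.7 line matches
```

Matching on the dot-separated suffix, rather than a plain substring, avoids flagging unrelated names such as notavsvmcloud.com or avsvmcloud.com.example.org.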

The Cybersecurity and Infrastructure Security Agency issued Emergency Directive 21–01 in response to the incident, advising all federal civilian agencies to disable Orion.

Figure: SolarWinds Sunburst attack network paths.

Ref. https://web.archive.org/web/20201220053318/https://msrc-blog.microsoft.com/2020/12/13/customer-guidance-on-recent-nation-state-cyber-attacks/

On Software Requirements

There are a few high-level tradeoffs in the requirements specification process. Each tradeoff can be thought of as an axis: Specificity (detailed vs vague), Audacity (visionary vs trivial/checkmark), Customer-driven (needs vs wants; with timelines).

Requirements can be too detailed: the more detailed and specific they are, the less understandable they are and the less flexible they are in a rapidly changing context. But if the requirements are too vague, then they are likely to be misunderstood or ignored by a development team. Here, talking directly to the end users and communicating clearly within the team to flesh out use cases will help.

Also, if the requirements are too visionary, then they may appear infeasible to the team. Showing that they are achievable by pointing to related products is one solution. Decomposing the target into achievable modules is another. If they are too near-term, then they may appear trivial and fail to excite the team.

Finally, the requirements should be well grounded in customer use cases and narrowly stated, rather than inherited as a long list from past successful technical products. This is probably the most important and hardest thing in practice.

Specifying the right amount of detail for development targets that are grounded, challenging and achievable is an important skill.

Another take on this topic is Joel Spolsky’s series on writing painless functional specifications.