The simple things, sometimes…

I (re-)learned an important lesson this week: if you’re an attacker, start at the front door.

This week I’ve had an interesting conversation with an organisation with which I’m involved*.  My involvement is as a volunteer, and has nothing to do with my day job – in other words, I have nothing to do with the organisation’s security.  However, I got an email from them telling me that in order to perform a particular action, I should now fill in an online form, which would then record the information that they needed.

So this week’s blog entry will be about entering information on an online form.  One of the simplest tasks that you might want to design – and secure – for any website.  I wish I could say that it’s going to be a happy tale.

I had a look at this form, and then I looked at the URL they’d given me.  It wasn’t a fully qualified URL, in that it had no protocol component, so I copied and pasted it into a browser to find out what would happen.  I hoped that it might automatically redirect to an https-served page.  It didn’t.  It was an http-served page.

Well, not necessarily so bad, except that … it wanted some personal information.  Ah.

So, I cheated: I changed the http:// … to an https:// and tried again**.  And got an error.  The certificate was invalid.  So even if they changed the URL, it wasn’t going to help.
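For anyone who wants to repeat the check without a browser, here is a minimal sketch in Python.  The function names and the example host are my own inventions for illustration, not anything the organisation uses.  The first function builds the http and https variants of a URL that may have no protocol component; the second attempts a TLS handshake with normal certificate verification and reports why it failed, if it did.

```python
import socket
import ssl
from urllib.parse import urlsplit


def candidate_urls(raw: str) -> dict:
    """Return the http and https variants of a URL that may lack a scheme."""
    # Without a scheme, urlsplit treats the host as part of the path, so
    # prepend "//" to make it parse the first segment as the network location.
    has_scheme = raw.startswith(("http://", "https://"))
    parts = urlsplit(raw if has_scheme else "//" + raw)
    rest = parts.path or "/"
    if parts.query:
        rest += "?" + parts.query
    return {
        "http": f"http://{parts.netloc}{rest}",
        "https": f"https://{parts.netloc}{rest}",
    }


def certificate_problem(host: str, port: int = 443, timeout: float = 5.0):
    """Return None if the TLS handshake verifies, else a short description."""
    context = ssl.create_default_context()  # checks the chain and the hostname
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return None
    except ssl.SSLCertVerificationError as exc:
        return f"certificate rejected: {exc.verify_message}"
    except OSError as exc:
        return f"connection failed: {exc}"


# "forms.example.org" is a placeholder host, not the real site.
print(candidate_urls("forms.example.org/signup?id=7"))
```

Calling certificate_problem() on the real host is the scripted equivalent of changing http:// to https:// by hand: anything other than None means that switching the link alone would not fix the problem.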

So what did I do?  I got in touch with my contact at the organisation, advising them that there was a possibility that they might be in breach of their obligations under Data Protection legislation.

I got a phone call a little later.  Not from a technical person – though there was a techie in the background.  They said that they’d spoken with the IT and security departments, and that there wasn’t a problem.  I disagreed, and tried to explain.

The first argument was whether there was any confidential information being entered.  They said that there was no linkage between the information being entered and the confidential information held in a separate system (I’m assuming database). So I stepped back, and asked about the first piece of information requested on the form: my name.  I tried a question: “Could the fact that I’m a member of this organisation be considered confidential in any situation?”

“Yes, it could.”

So, that’s one issue out of the way.

But it turns out that the information is stored encrypted on the organisation’s systems.  “Great,” I said, “but while it’s in transit, while it’s being transmitted to those systems, then somebody could read it.”

And this is where communication stopped.  I tried to explain that unless the information from the form is transmitted over https, then people could read it.  I tried to explain that if I sent it over my phone, then people at my mobile provider could read it.  I tried a simple example: I tried to explain that if I transmitted it from a laptop in a Starbucks, then people who run the Starbucks systems – or even possibly other Starbucks customers – could see it.  But I couldn’t get through.

In the end, I gave up.  It turns out that I can avoid using the form if I want to.  And the organisation is of the firm opinion that it’s not at risk: that all the data that is collected is safe.  It was quite clear that I wasn’t going to have an opportunity to argue this with their IT or security people: although I did try to explain that this is an area in which I have some expertise, they’re not going to let any Tom, Dick or Harry*** bother their IT people****.

There’s no real end to this story, other than to say that sometimes it’s the small stuff we need to worry about.  The issues that, as security professionals, we feel are cut and dried, are sometimes the places where there’s still lots of work to be done.  I wish it weren’t the case, because frankly, I’d like to spend my time educating people on the really tricky things, and explaining complex concepts around cryptographic protocols, trust domains and identity, but I (re-)learned an important lesson this week: if you’re an attacker, start at the front door.  It’s probably not even closed: let alone locked.


*I’m not going to identify the organisation: it wouldn’t be fair or appropriate.  Suffice to say that they should know about this sort of issue.

**I know: all the skillz!

***Or “J. Random User”.  Insert your preferred non-specific identifier here.

****I have some sympathy with this point of view: you don’t want to have all of their time taken up by random “experts”.  The problem is when there really _are_ problems.  And the people calling them maybe do know their thing.

“Zero-trust”: my love/hate relationship

… “explicit-trust networks” really is a much better way of describing what’s going on here.

A few weeks ago, I wrote a post called “What is trust?”, about how we need to be more precise about what we mean when we talk about trust in IT security.  I’m sure it’s a case of confirmation bias*, but since then I’ve been noticing more and more references to “zero-trust networks”.  This both gladdens and annoys me, a set of conflicting emotions almost guaranteed to produce a new blog post.

Let’s start with the good things about the term.  “Zero-trust networks” are an attempt to describe an architectural approach which addresses the disappearance of macro-perimeters within the network.  In other words, people have realised that putting up a firewall or two between one network and another doesn’t have a huge amount of effect when traffic flows across an organisation – or between different organisations – are very complex and don’t just follow one or two easily defined – and easily defended – routes.  This problem is exacerbated when the routes are not only multiple – but also virtual.  I’m aware that all network traffic is virtual, of course, but in the old days**, even if you had multiple routing rules, ingress and egress of traffic all took place through a single physical box, which meant that this was a good place to put controls***.

Those days (mythical as they were) have gone.  Not only do we have SDN (Software-Defined Networking) moving packets around via different routes willy-nilly, but networks are overwhelmingly porous.  Think about your “internal network”, and tell me that you don’t have desktops, laptops and mobile phones connected to it which have multiple links to other networks which don’t go through your corporate firewall.  Even if they don’t******, when they leave your network and go home for the night, those laptops and mobile phones – and those USB drives that were connected to the desktop machines – are free to roam the hinterlands of the Internet******* and connect to pretty much any system they want.

And it’s not just end-point devices, but components of the infrastructure which are much more likely to have – and need – multiple connections to different other components, some of which may be on your network, and some of which may not.  To confuse matters yet further, consider the “Rise of the Cloud”, which means that some of these components may start on “your” network, but may migrate – possibly in real time – to a completely different network.  The rise of micro-services (see my recent post describing the basics of containers) further exacerbates the problem, as placement of components seems to become irrelevant, so you have an ever-growing (and, if you’re not careful, exponentially-growing) number of flows around the various components which comprise your application infrastructure.

What the idea of “zero-trust networks” says about this – and rightly – is that a classical, perimeter-based firewall approach becomes pretty much irrelevant in this context.  There are so many flows, in so many directions, between so many components, which are so fluid, that there’s no way that you can place firewalls between all of them.  Instead, it says, each component should be responsible for controlling the data that flows in and out of itself, and should assume that it has no trust for any other component with which it may be communicating.

I have no problem with the starting point for this – which is as far as some vendors and architects take it: all users should always be authenticated to any system, and authorised before they access any service provided by that system.  In fact, I’m even more in favour of extending this principle to components on the network: it absolutely makes sense that a component should control access to its services with API controls.  This way, we can build distributed systems made of micro-services or similar components which can be managed in ways which protect the data and services that they provide.

And there’s where the problem arises.  Two words: “be managed”.

In order to make this work, there needs to be one or more policy-dictating components (let’s call them policy engines) from which other components can derive their policy for enforcing controls.  The client components must have a level of trust in these policy engines so that they can decide what level of trust they should have in the other components with which they communicate.

This exposes a concomitant issue: these components are not, in fact, in charge of making the decisions about who they trust – which is how “zero-trust networks” are often defined.  They may be in charge of enforcing these decisions, but not the policy with regard to the enforcement.  It’s like a series of military camps: sentries may control who enters and exits (enforcement), but those sentries apply orders that they’ve been given (policies) in order to make those decisions.

Here, then, is what I don’t like about “zero-trust networks” in a few nutshells:

  1. although components may start from a position of little trust in other components, that moves to a position of known trust rather than maintaining a level of “zero-trust”
  2. components do not decide what other components to trust – they enforce policies that they have been given
  3. components absolutely do have to trust some other components – the policy engines – or there’s no way to bootstrap the system, nor to enforce policies.
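
The three points above can be sketched in a few lines of Python.  This is an illustration under my own assumptions rather than a description of any real product: PolicyEngine, Component, Rule and the service names are all hypothetical.  The thing to notice is that the component only enforces rules, and that it has to trust the engine before it can do anything at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    caller: str  # identity of the component making the request
    action: str  # operation that caller is allowed to perform


class PolicyEngine:
    """Hypothetical policy-dictating component: it decides, others enforce."""

    def __init__(self, policies: dict):
        # Maps the name of a protected component to the rules that apply to it.
        self._policies = policies

    def policy_for(self, component: str) -> frozenset:
        return frozenset(self._policies.get(component, ()))


class Component:
    """Hypothetical service that enforces a policy it did not write."""

    def __init__(self, name: str, engine: PolicyEngine):
        self.name = name
        # Point 3: bootstrapping requires explicit trust in the policy engine.
        self._policy = engine.policy_for(name)

    def handle(self, caller: str, action: str) -> bool:
        # Point 2: this is enforcement only; the decision was made elsewhere.
        # Point 1: callers named in the policy hold known, explicit trust.
        return Rule(caller, action) in self._policy


engine = PolicyEngine({"billing": {Rule("orders", "read_invoice")}})
billing = Component("billing", engine)
print(billing.handle("orders", "read_invoice"))    # permitted by policy
print(billing.handle("orders", "delete_invoice"))  # no matching rule
```

Anything not named in the policy is refused, which is the “zero” part.  Everything that is permitted is permitted because somebody wrote it down, which is the “explicit” part.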

I know it’s not so snappy, but “explicit-trust networks” really is a much better way of describing what’s going on here.  What I do prefer about this description is it’s a great starting point to think about trust domains.  I love trust domains, because they allow you to talk about how to describe shared policy between various components, and that’s what you really want to do in the sort of architecture that I’ve talked about above.  Trust domains allow you to talk about issues such as how placement of components is often not irrelevant, about how you bootstrap your distributed systems, about how components are not, in the end, responsible for making decisions about how much they trust other components, or what they trust those other components to do.

So, it looks like I’m going to have to sit down soon and actually write about trust domains.  I’ll keep you posted.

 


*one of my favourite cognitive failures

**the mythical days that my children believe in, where people had bouffant hairdos, the Internet could fit on a single Winchester disk, and Linus Torvalds still lived in Finland.

***of course, there was no such perfect time – all I should need to say to convince you is one word: “Joshua”****

****yes, this is another filmic***** reference.

*****why, oh why doesn’t my spell-checker recognise this word?

******or you think they don’t – they do.

*******and the “Dark Web”: ooooOOOOoooo.

Embracing fallibility

History repeats itself because no one was listening the first time. (Anonymous)

We’re all fallible.  You’re fallible, he’s fallible, she’s fallible, I’m fallible*.  We all get things wrong from time to time, and the generally accepted “modern” management approach is that it’s OK to fail – “fail early, fail often” – as long as you learn from your mistakes.  In fact, there’s a growing view that if you don’t fail, you can’t learn – or that your learning will be slower, and restricted.

The problem with some fields – and IT security is one of them – is that failing can be a very bad thing, with lots of very unexpected consequences.  This is particularly true for operational security, but the same can be the case for application, infrastructure or feature security.  In fact, one of the few expected consequences is that call to visit your boss once things are over, so that you can find out how many days*** you still have left with your organisation.  But if we are to be able to make mistakes**** and learn from them, we need to find ways to allow failure to happen without catastrophic consequences to our organisations (and our careers).

The first thing to be aware of is that we can learn from other people’s mistakes.  There’s a famous aphorism, supposedly first said by George Santayana and often translated as “Those who cannot learn from history are doomed to repeat it.”  I quite like the alternative:  “History repeats itself because no one was listening the first time.”  So, let’s listen, and let’s consider how to learn from other people’s mistakes (and our own).  The classic way of thinking about this is by following “best practices”, but I have a couple of problems with this phrase.  The first is that very rarely can you be certain that the context in which you’re operating is exactly the same as that of those who framed these practices.  The other – possibly more important – is that “best” suggests the summit of possibilities: you can’t do better than best.  But we all know that many practices can indeed be improved on.  For that reason, I rather like the alternative, much-used at Intel Corporation, which is “BKMs”: Best Known Methods.  This suggests that there may well be better approaches waiting to be discovered.  It also talks about methods, which suggests to me more conscious activities than practices, which may become unconscious or uncritical followings of others.

What other opportunities are open to us to fail?  Well, to return to a theme which is dear to my heart, we can – and must – discuss with those within our organisations who run the business what levels of risk are appropriate, and explain that we know that mistakes can occur, so how can we mitigate against them and work around them?  And there’s the word “mitigate” – another approach is to consider managed degradation as one way to protect our organisations***** from the full impact of failure.

Another is to embrace methodologies which have failure as a key part of their philosophy.  The most obvious is Agile Programming, which can be extended to other disciplines, and, when combined with DevOps, allows not only for fast failure but fast correction of failures.  I plan to discuss DevOps – and DevSecOps, the practice of rolling security into DevOps – in more detail in a future post.

One last approach that springs to mind, and which should always be part of our arsenal, is defence in depth.  We should be assured that if one element of a system fails, that’s not the end of the whole kit and caboodle******.  That only works if we’ve thought about single points of failure, of course.

The approaches above are all well and good, but I’m not entirely convinced that any one of them – or a combination of them – gives us a complete enough picture that we can fully embrace “fail fast, fail often”.  There are other pieces, too, including testing, monitoring, and organisational cultural change – an important and often overlooked element – that need to be considered, but it feels to me that we have some way to go, still.  I’d be very interested to hear your thoughts and comments.

 


*my family is very clear on this point**.

**I’m trying to keep it from my manager.

***or if you’re very unlucky, minutes.

****amusingly, I first typed this word as “misteaks”.  You’ve got to love those Freudian slips.

*****and hence ourselves.

******no excuse – I just love the phrase.

 

 

Single Point of Failure

Avoiding cascade failures with systems thinking.

Let’s start with a story.  Way back in the mists of time*, I performed audits for an organisation which sent out cryptographic keys to its members.  These member audits involved checking multiple processes and systems, but the core one was this: the keys that were sent out were a really big deal, as they were the basis from which tens of thousands of other keys would be derived.  So, the main key that was sent out was really, really important, because if it got leaked, the person who got hold of it would have a chance to do many, many Bad Things[tm].

The main organisation thought that allowing people the possibility to do Bad Things[tm] wasn’t generally a good idea, so they had a rule.  You had to follow a procedure, which was this: they would send out this key in two separate parts, to be stored in two different physical safes, to be combined by two different people, reporting to two different managers, in a process split into two separate parts, which ensured that the two different key-holders could never see the other half of the key.  The two separate parts were sent out by separate couriers, so that nobody outside the main organisation could ever get to see the two parts.  It was a good, and carefully thought out process.
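
I’m not going to describe how that organisation actually split its keys, but one common way to get the property described above is an XOR split, sketched here in Python.  The function names are mine, and the scheme is an assumption for illustration: each part on its own is indistinguishable from random bytes, and only combining both recovers the key.

```python
import secrets


def split_key(key: bytes) -> tuple:
    """Split a key into two parts, neither of which reveals anything alone."""
    # The first part is pure randomness of the same length as the key.
    part_a = secrets.token_bytes(len(key))
    # The second part is the key XORed with that randomness.
    part_b = bytes(k ^ a for k, a in zip(key, part_a))
    return part_a, part_b


def combine(part_a: bytes, part_b: bytes) -> bytes:
    """Recover the key; this needs both parts, held by different people."""
    if len(part_a) != len(part_b):
        raise ValueError("key parts must be the same length")
    return bytes(a ^ b for a, b in zip(part_a, part_b))


main_key = secrets.token_bytes(16)  # stand-in for the real key
first, second = split_key(main_key)
print(combine(first, second) == main_key)
```

The cryptography is the easy bit.  Whether the two parts really stay separate depends on the people and processes around them.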

So one of the first things I’d investigate, on arriving at a member company to perform an audit, would be how they managed their part of this process.  And, because they were generally fairly clued up, or wouldn’t have been allowed to have the keys in the first place, they’d explain how careful they were with the key components, and who reported to whom, and where the safes were, and back-up plans for when the key holders were ill: all good stuff.  And then I’d ask: “And what happens when a courier arrives with the key component?”  To which they’d reply: “Oh, the mail room accepts the package.”  And then I’d ask “And when the second courier arrives with the second key component?”  And nine times out of ten, they’d answer: “Oh, the mail room accepts that package, too.”  And then we’d have a big chat.**
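
The audit question boils down to a set intersection, which makes it easy to sketch in Python.  The custody chains below are made up to match the story, and the function name is mine: list everybody who handles each key component, and see whether anyone appears on every list.

```python
def parties_seeing_all_parts(custody: dict) -> set:
    """Return every party that handles all of the key components."""
    chains = [set(chain) for chain in custody.values()]
    if not chains:
        return set()
    # Anyone in the intersection has had access to every component.
    return set.intersection(*chains)


# Hypothetical custody chains for the two key components in the story.
custody = {
    "component 1": ["courier A", "mail room", "key holder 1", "safe 1"],
    "component 2": ["courier B", "mail room", "key holder 2", "safe 2"],
}
print(parties_seeing_all_parts(custody))  # {'mail room'}
```

An empty result is what the procedure was designed to guarantee.  Anything else is the start of a big chat.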

This is a classic example of a single point of failure.  Nobody designs systems with a single point of failure on purpose****, but they just creep in.  I’m using the word systems here in the same way I used it in my post Systems security – why it matters: in the sense of a bunch of different things working together, some of which are likely to be human, some of which are likely to be machine.  And it’s hard to work out where single points of failure are.  A good way to avoid them – or minimise their likelihood of occurrence – is to layer or overlap systems*****.  What is terrible is when two single points of failure are triggered at once, because they overlap.  From the little information available to us, this seems to be what happened to British Airways over the past weekend: they had a power failure, and then their backups didn’t work.  In other words, they had a cascade failure – one thing went wrong, and then, when another thing went wrong as well, everything fell over.  This is terrible, and every IT professional out there ought to be cringing a little bit inside at the thought that it might happen to them.******

How can you stop this happening?  It’s hard, really it is, because the really catastrophic failures only happen rarely – pretty much by definition. Here are some thoughts, though:

  • look at pinch points, where a single part of the system, human or machine, is doing heavy lifting – what happens when they fail?
  • look at complex processes with many interlocking pieces – what happens if one of them produces unexpected results (or none)?
  • look at processes with many actors – what happens if one actor fails to do what is expected?
  • look at processes with a time element to them – what will happen if an element doesn’t produce results when expected?
  • try back-tracking, rather than forward-tracking.  We tend to think forwards, from input to output: try the opposite, and see what the key parts to any output are.  This may give unexpected realisations about critical inputs and associated components.
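
The back-tracking suggestion in the last bullet can be sketched in Python.  Everything here is hypothetical: the component names are invented, not a description of any real airline’s systems, and the model assumes the dependencies contain no cycles.  Each component lists its requirements, each requirement is a list of alternatives (any one will do), and we start from the output and ask which single failure brings it down.

```python
def works(node: str, requires: dict, failed: set) -> bool:
    """A node works if it has not failed and every requirement is met."""
    if node in failed:
        return False
    # Each requirement is a list of alternatives; one working one is enough.
    return all(
        any(works(alt, requires, failed) for alt in alternatives)
        for alternatives in requires.get(node, [])
    )


def single_points_of_failure(output: str, requires: dict) -> set:
    """Fail each component in turn and see whether the output survives."""
    components = set(requires) | {
        alt for reqs in requires.values() for alts in reqs for alt in alts
    }
    return {
        c for c in components
        if c != output and not works(output, requires, {c})
    }


# Hypothetical system: two power sources look redundant, but both are
# routed through the same transfer switch.
requires = {
    "booking system": [["data centre"]],
    "data centre": [["mains feed", "backup generator"]],
    "mains feed": [["transfer switch"]],
    "backup generator": [["transfer switch"]],
}
print(single_points_of_failure("booking system", requires))
```

In this made-up example the mains feed and the generator cover for each other, so neither is reported.  The data centre and the shared transfer switch both are, and the switch is the one that a forward-thinking design review would be likely to miss.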

Last: don’t assume that your systems are safe.  Examine, monitor, test, remediate.  You might******* also have a good long hard think about managed degradation: it’s really going to help if things do go horribly wrong.

Oh – and good luck.


*around ten years ago.  It feels like a long time, anyway.

**because, in case you missed it, that meant that the person in charge of the mail room had access to both parts of the key.***

***which meant that they needed to change their policies, quickly, unless they wanted to fail the audit.

****I’m assuming that we’re all Good Guys and Gals[tm], right, and not the baddies?

*****the principle of defence in depth derives from this idea, though it’s only one way to do it.

******and we/you shouldn’t be basking in the schadenfreude.  No, indeed.

*******should.  Or even must.  Just do it.

 

“What is trust?”

I trust my brother and my sister with my life.

Academic discussions about trust abound*.  Particularly in the political and philosophical spheres, the issue of how people trust in institutions, and when and where they don’t, is an important topic of discussion, particularly in the current political climate.  Trust is also a concept which is very important within security, however, and not always well-defined or understood.  It’s central to my understanding of what security means, and how I discuss it, so I’m going to spend this post trying to explain what I mean by “trust”.

Here’s my definition of trust, and three corollaries.

  • “Trust is the assurance that one entity holds that another will perform particular actions according to a specific expectation.”
  • My first corollary**: “Trust is always contextual.”
  • My second corollary: “One of the contexts for trust is always time.”
  • My third corollary: “Trust relationships are not symmetrical.”

Why do we need this set of definitions?  Surely we all know what trust is?

The problem is that whilst humans are very good at establishing trust with other humans (and sometimes betraying it), we tend to do so in a very intuitive – and therefore imprecise – way.  “I trust my brother” is all very well as a statement, and may well be true, but such a statement is always made contextually, and that context is usually implicit.  Let me provide an example.

I trust my brother and my sister with my life.  This is literally true for me, and you’ll notice that I’ve already contextualised the statement: “with my life”.  Let’s be a little more precise.  My brother is a doctor, and my sister a trained scuba diving professional.  I would trust my brother to provide me with emergency medical aid, and I would trust my sister to service my diving gear****.  But I wouldn’t trust my brother to service my diving gear, nor my sister to provide me with emergency medical aid.  In fact, I need to be even more explicit, because there are times when I would trust my sister in the context of emergency medical aid: I’m sure she’d be more than capable of performing CPR, for example.  On the other hand, my brother is a paediatrician, not a surgeon, so I’d not be very confident about allowing him to perform an appendectomy on me.

Let’s look at what we’ve addressed.  First, we dealt with my definition:

  • the entities are me and my siblings;
  • the actions ranged from performing an emergency appendectomy to servicing my scuba gear;
  • the expectation was actually fairly complex, even in this simple example: it turns out that trusting someone “with my life” can mean a variety of things from performing specific actions to remedy an emergency medical condition to performing actions which, if neglected or incorrectly carried out, could cause death in the future.

We also addressed the first corollary:

  • the contexts included my having a cardiac arrest, requiring an appendectomy, and planning to go scuba diving.

Let’s add time – the second corollary:

  • my sister has not recently renewed her diving instructor training, so I might feel that I have less trust in her to service my diving gear than I might have done five years ago.

The third corollary is so obvious in human trust relationships that we often ignore it, but it’s very clear in our examples:

  • I’m neither a doctor nor a trained scuba diving instructor, so my brother and my sister trust me neither to provide emergency medical care nor to service their scuba gear.******

What does this mean to us in the world of IT security?  It means that we need to be a lot more precise about trust, because humans come to this arena with a great many assumptions.  When we talk about a “trusted platform”, what does that mean?  It must surely mean that the platform is trusted by an entity (the workload?) to perform particular actions (provide processing time and memory?) whilst meeting particular expectations (not inspecting program memory? maintaining the integrity of data?).  The context of what we mean for a “trusted platform” is likely to be very different between a mobile phone, a military installation and an IoT gateway.  And that trust may erode over time (are patches applied? is there a higher likelihood that an attacker may have compromised the platform a day, a month or a year after the workload was provisioned to it?).

We should also never simply say, following the third corollary, that “these entities trust each other”.  A web server and a browser may have established trust relationships, for example, but these are not symmetrical.  The browser has probably established with sufficient assurance for the person operating it to give up credit card details that the web server represents the provider of particular products and services.  The web server has probably established that the browser currently has permission to access the account of the user operating it.

Of course, we don’t need to be so explicit every time we make such a statement.  We can explain these relationships in definitions of documents, but we must be careful to clarify what the entities, the expectations, the actions, the contexts and possible changes in context are.  Without this, we risk making dangerous assumptions about how these entities operate and what breakdowns in trust mean and could entail.
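
One way to force yourself to be explicit is to write the relationship down as a data structure.  This is a minimal sketch in Python, with field names that are my own choice rather than any standard: the truster and trustee are the entities, the action and context come from the definition and the first corollary, the validity window covers the second corollary, and the fact that the two entities are separate fields which are never swapped covers the third.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Trust:
    truster: str            # the entity holding the assurance
    trustee: str            # the entity expected to act
    action: str             # what the trustee is trusted to do
    context: str            # the situation in which the trust applies
    established: datetime   # when the assurance was gained
    valid_for: timedelta    # how long we accept it without renewal

    def holds(self, truster, trustee, action, context, at) -> bool:
        """True only for this exact direction, action, context and time."""
        same_relationship = (
            (self.truster, self.trustee, self.action, self.context)
            == (truster, trustee, action, context)
        )
        in_time = self.established <= at <= self.established + self.valid_for
        return same_relationship and in_time


# The dates and validity period are invented for illustration.
gear = Trust("me", "my sister", "service diving gear", "planning a dive",
             established=datetime(2017, 1, 1), valid_for=timedelta(days=365))

print(gear.holds("me", "my sister", "service diving gear",
                 "planning a dive", at=datetime(2017, 6, 1)))
print(gear.holds("my sister", "me", "service diving gear",
                 "planning a dive", at=datetime(2017, 6, 1)))
```

The second query simply swaps the two entities, and it fails: if my sister is to trust me, that is a separate relationship which has to be established separately.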


*Which makes me think of rabbits.

**I’m hoping that we can all agree on these – otherwise we may need to agree on a corollary bypass.***

***I’m sorry.

****I’m a scuba diver, too.  At least in theory.*****

*****Bringing up children is expensive and time-consuming, it turns out.

******I am, however, a trained CFR, so I hope they’d trust me to perform CPR on them.

Systems security – why it matters

… to understand how things will work together, you have to consider them as a system…

“A system is a set of interacting or interdependent component parts forming a complex or intricate whole.  Every system is delineated by its spatial and temporal boundaries, surrounded and influenced by its environment, described by its structure and purpose and expressed in its functioning.” (Wikipedia: system)

I’ve been involved with various types of security over the years, from features within products to storage, network and other communications security, and including stand-alone application security, cryptographic protocol design and other weird and wonderful issues like why you shouldn’t lose too much weight on holiday.*  That’s a subject for another post.  But what I keep coming back to is systems security.

And that’s because you can design all the security into a particular component that you like, you can take as much care in coding it as you like, you can ensure that you compile it safely, you can test it to within an inch of its life, and ensure that it is deployed where and how you like – but if it’s part of a system, and that system has other holes, then you might as well not bother.  We** often talk about “the weakest link in the chain” as a way of pointing out that if you have a single problem in a set of components, that’s what will break.  That’s too simplistic an analogy***, though, as different components interact in different ways with each other, dependent on a variety of factors.

In order to understand how things will work together, you have to consider them as a system, to define what their behaviour as a system will be, and to architect the system with an understanding of the risks, threats and likely attackers that it will have to deal with in its lifetime.

Much of the content of this blog may discuss components, but I hope that I’ll manage to explain their place in systems, and how they work together.  Join me: it should be fun****.


*that’s a subject for another post – it’ll be fun

**by which I mean the nebulous “security community”

***don’t start me on analogies

****another disclaimer – I think that security is fun.  Not everybody agrees.  I’m presuming that the fact that you’ve made it this far means that you are at least open to the suggestion.