1. Sec-DevOps
1.1 Concepts
1.1.1 SecDevOps or DevSecOps
- Which one is better
- Which one sends the better message
- DevSecOps seems to be getting slightly more Google queries: https://www.google.co.uk/trends/explore?q=secdevops,devsecops
- DevSecOps has the advantage that it puts Dev (Development) first (which is good, since that is the most important part)
- SecDevOps is a good extension of DevOps, which is already a known practice
- I also like the idea that, once Security becomes embedded in the SDL, Sec-DevOps just becomes DevOps
- I really like the definitions in 'DevSecOps & SecDevOps - Is there a difference?': https://www.linkedin.com/pulse/devsecops-secdevops-difference-kumar-mba-msc-cissp-mbcs-citp
1.1.2 Don’t blame the developers
A very common and dangerous misconception in the security world is that developers are responsible for any security vulnerabilities, or even security mistakes.
The developers wrote the code, but it is essential to understand the ecosystem and the environment that produced that code.
Most developers don’t control their schedule, their learning, or how much time they get to visualize and understand what they do. Therefore, to make them responsible for the code they are writing is a very dangerous and short-sighted analysis of their role.
The key influencers of the code are the development managers, the business managers, and the business owners, who control what gets done and are responsible for what gets created.
The other challenge is that the current development frameworks and IDEs (Integrated Development Environments) don’t reward the visualization and understanding of the side effects of code.
This means that when you are programming, it is very hard to understand the side effects of changing or adding a feature, or making a particular kind of implementation.
In such circumstances, how can we expect developers to be accountable and responsible for those code changes when they aren’t aware of the bigger picture?
There might be some cases where the developers should know better, but in my experience, those cases are very rare. My experience is that developers want to do the right thing but they don’t get the tools or even the allocation to do so.
Blaming developers creates a negative dynamic between security people and developers, in an already adversarial blame culture. What we need to do before blaming the developers is understand the ecosystem, and look at the reward system and the workflow. This is a much more positive way of looking at things.
1.1.3 Good resources on DevSecOps and SecDevOps
DevSecOps & SecDevOps - Is there a difference? https://www.linkedin.com/pulse/devsecops-secdevops-difference-kumar-mba-msc-cissp-mbcs-citp
SecDevOps
- https://www.reddit.com/r/secdevops/
- https://twitter.com/hashtag/secdevops
- SecDevOps: Embracing the Speed of DevOps and Continuous Delivery in a Secure Environment https://securityintelligence.com/secdevops-embracing-the-speed-of-devops-and-continuous-delivery-in-a-secure-environment/
- SecDevOps: Injecting Security Into DevOps Processes https://blog.newrelic.com/2015/08/27/secdevops-rugged-devops/
- SecDevOps - Source Code Review Consultants (at speed) https://www.seek.com.au/Job/32037282?cid=advlinkedin
- The 12 Days of SecDevOps http://blog.threatstack.com/the-12-days-of-secdevops
- Advanced Cloud Security and applied SecDevOps https://www.blackhat.com/us-16/training/advanced-cloud-security-and-applied-secdevops.html
- Swimming application security upstream with SecDevOps http://www.esg-global.com/blog/swimming-application-security-upstream-with-secdevops
- https://devops.com/tag/secdevops/
- Murphy’s DevOps: Is Security Causing Things to Go Wrong? https://devops.com/2016/02/29/murphys-devops-is-security-causing-things-to-go-wrong/
- Flash Mob Inflection: Rugged DevOps Revolution https://devops.com/2016/02/19/flash-mob-inflection-rugged-devops-revolution/
- Security Breaks DevOps – Here’s How to Fix It https://devops.com/2015/07/08/security-breaks-devops-heres-how-to-fix-it/
- The devOpsSec Dilemma: Effective Strategies for Social Networking https://devops.com/2015/05/28/devopssec-dilemma-effective-strategies-social-networking/
- Automated Security Testing in a Continuous Delivery Pipeline https://devops.com/2015/04/06/automated-security-testing-continuous-delivery-pipeline/
- It’s time security pros shake their DevOps fear, uncertainly, and doubt https://devops.com/2015/05/18/its-time-security-pros-shake-their-devops-fear-uncertainly-and-doubt/
- ChatOps: Communicating at the speed of DevOps https://devops.com/2014/07/16/chatops-communicating-speed-devops/
- SecDevOps: The New Black of IT http://www.slideshare.net/CloudPassage/sec-devops-webinar-deck
DevSecOps
- https://github.com/devsecops/awesome-devsecops (more links)
- https://twitter.com/hashtag/devsecops
- http://www.devsecops.org/
- http://www.devseccon.com/
- A primer on secure DevOps: Why DevSecOps matters http://techbeacon.com/devsecops-foundations
- Why Did We Need to Invent DevSecOps? http://blog.threatstack.com/why-did-we-need-to-invent-devsecops
- DevOpsSec Securing Software through Continuous Delivery http://www.oreilly.com/webops-perf/free/devopssec.csp
- DevSecOps: The Marriage of SecOps and DevOps www.tripwire.com/state-of-security/security-awareness/devsecops-the-marriage-of-secops-and-devops/
- A Look Back at DevOpsDays Austin 2016 http://blog.threatstack.com/a-look-back-at-devopsdays-austin-2016
- MASTERED DEVOPS? WHAT’S NEXT? DEVSECOPS, THAT’S WHAT! https://www.cloudreach.com/us-en/2014/11/devops-devsecops/?topic=aws
- Gartner: DevOps is good; DevSecOps is better http://searchcio.techtarget.com/tip/Gartner-DevOps-is-good-DevSecOps-is-better
1.1.4 History of Sec-DevOps
1.1.5 Making the Sec part invisible
- concept that a lot of the work done today (2016) in AppSec and SecDevOps is part of a transition into just DevOps
- when security is embedded into the Dev SDL
- when security is invisible
- Case-Study: one of the biggest advantages of Microsoft's Security push has been the quality and robustness of their products, and the big improvement in their SDL (Story: the IE body count from the Renegades of the Empire book)
- add reference to (and some content from) 'Secure coding (and Application Security) must be invisible to developers'
1.1.6 Rugged Software
- https://www.ruggedsoftware.org/
- Rugged Manifesto
- add explanation of what it is (and its history)
- why it didn’t really work (at least according to the original expectations)
- lack of community adoption
- ‘Security Driven’ vs ‘Developer Driven’
- The Docker case study
- why Docker was so successful (as an idea and adoption)
- lessons learned
1.1.7 Using Artificial Intelligence for proactive defense
We need AI to understand code and applications. Our code complexity is getting to a level that we need to start to use artificial intelligence capabilities to understand it, and to get a grasp of what is going on, so we can create secure applications that have no unintended side effects.
As AI becomes much more commonplace, we should start to use it more for source code analysis and application analysis. Kevin Kelly has some very interesting analysis on the use of AI, where he discusses the idea that one of the next major revolutions will be when we start adding AI to everything, because the cost of AI will become so low that we will be able to add AI to many devices.
When you analyse an app, you should use everything you have. You should use static, dynamic, interactive, human, and increasingly you should use artificial intelligence to optimise your analysis.
When you are doing security analysis, you are dealing with a vast amount of data, displayed on a multi-dimensional graph. What you have is a graph of the relationships, of what is happening. You are looking for the connections, for the paths within the graph, that show what is really going on and what is possible.
Artificial intelligence technology can assist the human who will put context on those connections. I think we are a long way from being able to do this kind of analysis automatically, but if we can make the human’s job of reviewing the results easier, or even possible, that is a major step forward.
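As a rough illustration of that 'graph of relationships' view, here is a minimal sketch using the networkx library; the call graph, entry points, and sinks are invented for the example, and are not from any specific tool:

```python
# Minimal sketch: represent code relationships as a graph and ask
# "which paths exist from untrusted input to a sensitive operation?".
# The networkx library and the example call graph are illustrative choices.
import networkx as nx

# Hypothetical call graph: an edge means "function A calls function B"
calls = [
    ("http_request_handler", "parse_user_input"),
    ("parse_user_input",     "build_sql_query"),
    ("build_sql_query",      "execute_sql"),        # sensitive sink
    ("http_request_handler", "render_template"),    # another sink
    ("admin_job",            "execute_sql"),
]

graph = nx.DiGraph(calls)

# Every path from an attacker-controlled entry point to a sink is a
# connection that a human reviewer should put context on.
entry_points = ["http_request_handler"]
sinks = ["execute_sql", "render_template"]

for source in entry_points:
    for sink in sinks:
        for path in nx.all_simple_paths(graph, source, sink):
            print(" -> ".join(path))
```

The point is not the library, but the shape of the problem: machines are good at enumerating the connections, and humans are still needed to decide which ones matter.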
1.1.8 When Failed Tests are Good
When you make a code change, it is fundamental that every change you make breaks a test, or breaks something. You are testing for that behaviour; you are testing for the particular action that you are changing.
This means you should be happy when you make a change and the test fails, because you can draw confidence from knowing the side effects of the change. If you make a code change in one place and a couple of tests break, and you make another code change in a different place and fifty tests break, you get a far better sense of the impact of the changes that you made.
A more worrying prospect is when you make code changes but you don’t see any test failing, and nothing breaks. This means you don’t understand the side effects of the change that you just made. And tests should be teaching you the side effects of changes.
Broken tests are great when the test that you expect to break is the one that fails. The changes that you were expecting to make are the ones that happen, and that makes sense.
When a test breaks, and you review that code, you understand why the break happened, and you understand the side effect. If the test fix is fast, you have a very quick and effective loop of TDD.
Sometimes, I will write something and it passes, so I know the code is working the way it is supposed to work. But I will have a couple of other cases where the test fails, and this allows me to confirm that it is failing in a predictable and expected way.
Sometimes, I will codify those failures in tests, to have and to give the assurance that the break happened in the particular place I expected, and that the fix behaved the way I expected it to.
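As a minimal sketch of what codifying those failures can look like in practice (pytest-style; the discount function and its rules are invented for the example):

```python
# Minimal sketch of codifying an expected failure as a test.
# The discount() function and its rules are hypothetical examples.
import pytest

def discount(price, percent):
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price - (price * percent / 100)

def test_discount_happy_path():
    # Passing test: confirms the behaviour I intended to change.
    assert discount(100, 10) == 90

def test_discount_rejects_invalid_percent():
    # The 'failing' case codified as a test: I expect this input to break,
    # and I want assurance that it breaks in this predictable way.
    with pytest.raises(ValueError):
        discount(100, 150)
```

The second test is the codified failure: it passes precisely because the break happens where, and how, I expected it to.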
1.1.9 Why SecDevOps?
I like SecDevOps because it reinforces the idea that it is an extension of DevOps. SecDevOps points to the objective that eventually we want the Sec part to disappear and leave us with DevOps.
Ultimately, we want an environment where 99% of the time, DevOps don’t care about security. They program in an environment where they either cannot create security vulnerabilities, or it is hard to create them, or it is easy to detect security issues when they occur.
This doesn’t mean you don’t need security teams, or AppSec experts. It doesn’t mean you don’t need a huge amount of work behind the scenes, and a huge amount of technology to create those environments.
You don’t make security invisible by getting rid of it. You make security invisible by automating the security checks, and by increasing visibility into what is going on.
At the moment, when we look at security activities, we often see security doing things that are the proper responsibility of development, or testing, or infrastructure, or documentation, or even management.
Anybody who works in AppSec for a while always finds themselves asking difficult questions. They interrogate the system rigorously, but the information they seek should already be known and available to them.
AppSec will often create tools to attack an application to visualize what is going on. I have had many experiences of spending time creating technology to understand the attack surface. Once that task is complete, I find a huge number of vulnerabilities, simply because a significant part of the application hadn’t been tested. The system owners, and the various players didn’t fully understand the application or how it behaved.
So, I think SecDevOps represents an interesting point in history, where we are trying to merge all the activities that have been going on in security with the activities that have been going on in DevOps so we can build an environment where we can create and deploy secure applications.
This relates closely to my ideas about using AppSec to measure quality. Ultimately, the quality of the application is paramount when we are trying to eliminate the unintended consequences of software.
DevSecOps initially sounds better because development goes first. But I agree with the view of DevSecOps as being more about applying development to security operations (SecOps).
This all ties together with the risk workflows that make things more connected and accountable.
1.1.10 Draft notes - DevOps
Stages of AppSec automation
Start with static analysis, which doesn’t need the application deployed to a live environment:
1. Have a CI that builds the code automatically for every commit of every branch
2. Run ‘a’ tool after the build (it doesn’t really matter which one; what matters is that it uses the materials created in step 1 - see the sketch after this list)
   - use for example FindBugs with FindSecBugs
3. Fine-tune the scan targets (i.e. exactly what needs to be scanned)
4. Filter the results, and customize rules to reduce noise
5. Gather the reports and scans and put them in a git repo
6. Create a consolidated dashboard (for management and business owners)
7. Add more tools
8. Loop from step 5
After this is mature, add a step that deploys the app to a live environment (or simulator).
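The following is a minimal sketch of steps 2 and 5 above: run a scanner against the build output and commit the report to a reports repo. The scanner command line, paths, and repo layout are illustrative assumptions, not the real CLI of any specific tool (for FindSecBugs you would substitute its actual command):

```python
# Minimal sketch of steps 2 and 5: run a scanner against the CI build output
# and version the report in a git repo. Command line and paths are assumptions.
import subprocess
from datetime import datetime
from pathlib import Path

BUILD_OUTPUT = Path("build/libs")          # artifacts created by the CI build (step 1)
REPORTS_REPO = Path("security-reports")    # a git repo that collects scan results
SCAN_COMMAND = ["my-static-analyser", "--target", str(BUILD_OUTPUT), "--format", "json"]

def run_scan() -> str:
    """Run the (hypothetical) scanner and return its raw report."""
    result = subprocess.run(SCAN_COMMAND, capture_output=True, text=True, check=True)
    return result.stdout

def store_report(report: str) -> None:
    """Commit the report so every scan is versioned alongside the code history."""
    report_file = REPORTS_REPO / f"scan-{datetime.now():%Y-%m-%d_%H%M%S}.json"
    report_file.write_text(report)
    subprocess.run(["git", "-C", str(REPORTS_REPO), "add", report_file.name], check=True)
    subprocess.run(["git", "-C", str(REPORTS_REPO), "commit", "-m",
                    f"Add scan report {report_file.name}"], check=True)

if __name__ == "__main__":
    store_report(run_scan())
```

Once this loop is stable for one tool, adding more tools (step 7) is mostly a matter of adding more SCAN_COMMAND entries and re-running the filtering step.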
1.2 Dev-Ops
1.2.1 Disposable IT infrastructure
Following the ‘everything is code’ concept, what you really want is an environment where your IT infrastructure is disposable. By this I mean that it should be very easy to rebuild, or recreate. Consider the developer’s environment, and how long it takes to create infrastructure. Then consider what happens to that infrastructure when the company hires more developers or the development team expands.
This is also an issue from a security point of view, because it means that the developers, and even normal IT users, have a lot of black magic in their infrastructure. And if any malware or malicious code is installed in an application, it will stay in there for a long period of time.
So, what you want is a situation where most of your infrastructure is automatically rebuilt. You want environments where developers’ laptops reboot every Monday, where you overhaul the infrastructure from the bottom up, and where business owners use Chromebooks so that every install is fresh.
Your infrastructure should be disposable, you shouldn’t care about it, and you should be able to easily rebuild, delete, or destroy, because that means all your data is backed up, and all your data is safe.
It also promotes the idea that in most scenarios, you shouldn’t be able to access all the data or assets that your current user role has access to (i.e. you should only have access to what you need to do the job at hand).
It would be great if we had a Git-based operating system, with native support for version control, even at the operating system level. This would provide full control of what is going on and what is changed (on install and over time), by using git diffs and branches.
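Until something like that exists, a rough approximation of the 'git diff on the operating system' idea is to keep a configuration directory under version control (tools such as etckeeper already do this for /etc) and treat any drift as a finding; the path below is an assumption:

```python
# Rough sketch of the 'git diff on the operating system' idea: if a
# configuration directory is tracked in git, drift from the last
# known-good state is just a git status/diff away.
import subprocess

CONFIG_REPO = "/etc"   # assumes this directory is already a git repository

def config_drift(repo: str = CONFIG_REPO) -> str:
    """Return uncommitted changes and untracked files, i.e. the 'black magic'."""
    status = subprocess.run(["git", "-C", repo, "status", "--porcelain"],
                            capture_output=True, text=True, check=True)
    return status.stdout

if __name__ == "__main__":
    drift = config_drift()
    print(drift or "No drift: the machine matches its versioned configuration")
```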
1.2.2 Don’t kill the Ops
- be careful that DevOps does not become a cost-saving exercise, where it is seen as a way to kill/reduce Ops
- Dev + Ops -> DevOps -> Dev (i.e. the Ops part quietly disappears)
- I have seen cases where there is a crazy idea that the ‘Ops’ team will be made redundant
1.2.3 Everything is code
It is critical to understand that everything that happens in an organisation, from the development, to the deployment, to the configuration, to the retirement of an app, is code.
This is tied to the concept of disposable infrastructure, where it is possible to rebuild everything from scratch automatically.
Since everything is code, everything should be versioned, stored in a Git repository, and tested.
Git is a very important component, since it allows the use of branches for managing multiple versions and deployments. The Git repo also captures all the configurations, settings, and mappings.
This is a big change, because a lot of what happens in any organisation is not documented, but instead is stored in somebody’s head, or in a script. The movement to use continuous integration workflows and embrace DevOps practices represents a good opportunity to capture the existing reality.
Basically, there should be no black-magic or non-versioned configuration, anywhere in the development and deployment pipeline.
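A minimal sketch of the 'everything should be versioned and tested' rule applied to configuration; the deploy.yml file name, its keys, and the sanity limits are made up for illustration:

```python
# Minimal sketch of 'configuration is code, so it gets tests too'.
# The deploy.yml file name and the required keys are hypothetical.
import yaml   # PyYAML

REQUIRED_KEYS = {"app_name", "environment", "replicas"}

def test_deploy_config_has_required_keys():
    with open("deploy.yml") as f:
        config = yaml.safe_load(f)
    missing = REQUIRED_KEYS - set(config)
    assert not missing, f"deploy.yml is missing keys: {missing}"

def test_deploy_config_has_sane_replicas():
    with open("deploy.yml") as f:
        config = yaml.safe_load(f)
    assert 1 <= config["replicas"] <= 50, "replica count outside expected range"
```

Tests like these are trivial to write, but they turn 'black magic' settings into something that breaks visibly, in CI, when somebody changes them.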
1.2.4 History of DevOps
1.2.5 Horizontal DevOps
The best model I have seen for integrating DevOps in a company is where teams are created that span multiple groups. Instead of having a top-down approach to the deployment of operations, where you create the central teams, the standards, the builds, etc., and then push it down, the central DevOps team hires or trains DevOps engineers and then allocates them to each team.
The logic is that each team spends a certain amount of time with a DevOps engineer, who trains the team in DevOps activities and best practices, and thereby embeds those best practices in the development life cycle.
The result is horizontal teams, which have several advantages. They have two masters: they answer to DevOps but they also answer to the team lead, and they share best practice. The creation of horizontal teams is a much better way to scale, and it encourages teams to collaborate. The teams know they aren’t required to spend all their time working in DevOps, and they know there is someone who can help them.
1.2.6 In DevOps Everything is Code
A common gap in DevOps workflows is, ironically, the absence of Application Security activities on the code the DevOps team writes (Secure coding, Static/Dynamic analysis, Threat Models, Security Reviews, Secure Coding Guidelines, Security Champions, Risk Workflows, etc.)
One cause for this gap is the fact that many DevOps teams come from network and infrastructure backgrounds, or network security backgrounds (i.e. traditional InfoSec), rather than from development (i.e. coding).
This leads to a lack of realization that every single configuration change, environment setup script (such as Chef/Puppet/Ansible files), or AWS/Azure/G-Cloud setting that exists in a DevOps pipeline is actually code, and that this code needs to be:
- versioned
- reviewed
- tested
- released
- rolled back
- and finally retired/deleted
This ‘DevOps code lifecycle’ needs everything that we talk about in AppSec.
What makes this ‘DevOps code’ even more important than ‘normal code’, is the fact that ‘DevOps code’ tends to run with full admin privileges. Any vulnerability in this code, any exploit or blind spot, any lack of settings or even malicious changes, will have a tremendous impact on the company’s risk profile.
You need to look at the build servers (Jenkins, Bamboo, TeamCity, Travis) and pay attention to what their status is, and what code is running on them.
I like the idea of a pristine build environment, completely isolated from other build servers and networked devices, and supported and maintained by the DevOps team. In there, application builds are created very cleanly, with very few side effects, and with a full understanding of what is going on. Ideally, the build server should have read-only access to certain dependencies, because it should not be modifying them.
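As a small illustration of that 'read-only dependencies' property, a check like the following (the cache path, and the idea of running it as the build user on the build server itself, are assumptions) can live in the build environment's own test suite:

```python
# Small sketch of asserting the 'read-only dependencies' property of a
# pristine build environment: the build user should not be able to modify
# the dependency cache. The path is an illustrative assumption.
import os

DEPENDENCY_CACHE = "/opt/build/dependency-cache"

def test_dependency_cache_is_read_only_for_build_user():
    # Intended to run as the build user, on the build server itself.
    assert os.access(DEPENDENCY_CACHE, os.R_OK), "build user cannot read dependencies"
    assert not os.access(DEPENDENCY_CACHE, os.W_OK), "dependency cache is writable!"
```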
The key paradigm shift here is to realize that we need all the AppSec practices in everything that happens in the DevOps world. A good example is an identity solution I participated in, where the code itself, in isolation, was rock solid.
But when the code went to production it became a horror show. There were bugs that only manifested themselves after deployment, functionality was missing, and, in the end, the result was very far from the original brief.
The main problem was a major lack of integration testing, namely end-to-end testing where you can test the code as it runs in production.
This is the power of DevOps.
You need to be able to take everything apart, and rebuild everything that is interconnected. You can use surrogate dependencies to mock-up certain external dependencies, but the idea is to have as much code as possible running at any given time.
To make all this work, the developer needs access to reliable and rebuildable environments. The developer needs a development workflow that proceeds from running a change as a purely traditional unit test, to full-blown integration tests running on live instances.
From here, the code/configuration change can proceed from running on a single machine, to a local or quasi-local environment, to the cloud environment that runs with the back-end and front-end components, with all the components integrating.
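To make the 'surrogate dependencies' idea concrete, here is a minimal sketch of a tiny local HTTP stub standing in for an external identity service during tests; the endpoint, payload, and port are made up for illustration:

```python
# Minimal sketch of a 'surrogate dependency': a tiny local HTTP stub that
# stands in for an external service during tests.
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

class FakeIdentityService(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"user": "test-user", "roles": ["reader"]}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):   # keep test output quiet
        pass

def start_surrogate(port=8099):
    server = HTTPServer(("127.0.0.1", port), FakeIdentityService)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def test_app_can_talk_to_identity_service():
    server = start_surrogate()
    try:
        with urlopen("http://127.0.0.1:8099/whoami") as response:
            payload = json.loads(response.read())
        assert payload["user"] == "test-user"
    finally:
        server.shutdown()
```

The design goal is that the code under test never knows (or cares) whether it is talking to the surrogate or to the real service.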
1.2.7 Infrastructure as code
- all changes, scripts, and clicks:
- are code
- need to be stored in git
- need tests (which can run both locally and in production - see the sketch below)
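A small sketch of what a test that 'can run locally and in production' might look like; the TARGET_URL variable, the default local port, and the /health endpoint are illustrative assumptions:

```python
# Small sketch of a test that runs locally and in production: the same
# check, pointed at a target chosen by an environment variable.
import os
from urllib.request import urlopen

TARGET = os.environ.get("TARGET_URL", "http://localhost:8080")

def test_service_health_endpoint_responds():
    with urlopen(f"{TARGET}/health", timeout=5) as response:
        assert response.status == 200
```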
1.2.8 Patch speed as a quality metric
Making small changes to an application (i.e. Patching) should be an easy and smooth event.
However, if it is a problem, it means that there are issues either in deploying the app, or in rebuilding the required infrastructure.
These issues need to be addressed sooner rather than later, especially since they affect the risk appetite of management and business owners to allow changes and refactoring to occur.
For patching, you really want to see a Kanban workflow with a healthy, fast, smooth flow and a low WIP (Work In Progress).
What happens when rollbacks are required?
Before an incident even happens, open a JIRA ticket on ‘lack of patching’. This way, when a lack of patching causes problems, you are ready to capture the incidents that occur.
History, and any experience of Murphy’s Law, should provide evidence on the cost of non-patching. A lack of patching acts as a canary in a coal mine, in so far as it points to bigger problems up ahead. You should think of patching as a type of fire drill.
When is patching easy
Patching is easy in the following circumstances:
- when it is easy to run the new version in an isolated environment (with and without production data)
- when there is a good understanding of what is going to change (files, ports, behavior, inter-dependencies, schema changes)
- i.e. a diff with the current version of everything (see the sketch below)
- when there are Tests (e2e and integration) that run through the affected systems’ success criteria and confirm any changes (i.e. the side effects)
- when it is easy to roll back to previous version(s)
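As a rough sketch of that 'diff with the current version of everything' at the file level, the following compares two release directories by hashing their contents; the directory names are assumptions:

```python
# Rough sketch: hash every file in the current and candidate release
# directories and report what was added, removed, or changed.
import hashlib
from pathlib import Path

def manifest(release_dir: str) -> dict:
    """Map relative file path -> content hash for one release."""
    base = Path(release_dir)
    return {
        str(path.relative_to(base)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in base.rglob("*") if path.is_file()
    }

def diff_releases(current_dir: str, candidate_dir: str) -> dict:
    current, candidate = manifest(current_dir), manifest(candidate_dir)
    return {
        "added":   sorted(set(candidate) - set(current)),
        "removed": sorted(set(current) - set(candidate)),
        "changed": sorted(f for f in set(current) & set(candidate)
                          if current[f] != candidate[f]),
    }

if __name__ == "__main__":
    print(diff_releases("releases/current", "releases/candidate"))
```

Ports, behaviour, and schema changes need their own diffs, but even a file-level manifest like this removes a lot of the surprise from a patch.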
1.2.9 Performing root-cause analysis
- some environments make it hard to perform root-cause analysis, because it is seen as a blame exercise instead of a learning opportunity
- root-cause analysis is key for any bug (namely the ones with security implications)
- lack of root-cause analysis is a risk (which needs to be accepted)
- it means the business owner doesn’t want to spend the time to find other (similar) issues
- add story of the project manager who asked “please don’t find more security issues during the retest (they are out of scope)”
- in a DevOps world, it is key for root-cause analysis that there is a way to replicate the issue (in a simulated environment)
- the end result of a root-cause analysis should be a test (very close to a script) that passes when the issue can be replicated correctly and reliably (see the sketch below)
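A minimal sketch of what such a replication test could look like (pytest-style; the vulnerable build_query function and the issue number are made up):

```python
# Minimal sketch of a root-cause analysis ending in a test: the test passes
# when the issue can be replicated reliably. build_query() is a made-up
# example of the root cause (user input concatenated straight into SQL).
def build_query(username: str) -> str:
    return f"SELECT * FROM users WHERE name = '{username}'"

def test_replicates_sql_injection_issue_1234():
    payload = "' OR '1'='1"
    query = build_query(payload)
    # The issue is replicated when the payload survives into the query unescaped.
    assert "' OR '1'='1" in query, "Issue no longer replicates - re-verify the fix"
```

Once the fix ships, a test like this is typically inverted into a regression test that asserts the payload no longer survives into the query.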
1.2.10 Run Apps Offline
The ability to run applications offline, i.e. without live dependencies on QA servers, or even live servers, is critical in the development process. That capability allows developers to code at enormous speed, because the big delays and expensive calls are usually to those external services; removing them allows all sorts of versioning, and all sorts of development techniques, to occur. The ability to run your apps offline also signifies that the application development environment has matured to a level where you now have, or have created, mocked versions of your dependencies.
Ideally, the faster you can run the dependencies, even running them as real code, the better. The important thing is to be sure you are running them locally, without a network connection, and without an umbilical cord to another system.
Of course you will always test against those other systems. The test that you are going to run locally against that mocked version should also pass against the live system. If the dependencies don’t pass, you have a problem and you have to fix it.
When you are developing, you should have all the required dependencies at that moment in time. This makes a huge difference when you are developing a new version, or a new feature, where you can already simulate how those dependencies behave. This allows for much better demos, a much faster interaction and development loop, and, ultimately, it accelerates developers’ productivity.
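A minimal sketch of the rule that the test which passes against the mocked dependency must also pass against the live system; pytest, the gateway classes, and the RUN_LIVE switch are illustrative assumptions:

```python
# Minimal sketch: the same test runs against the offline fake by default,
# and also against the live dependency when RUN_LIVE=1.
import os
import pytest

class FakePaymentGateway:
    """Offline surrogate with the same interface as the real gateway."""
    def charge(self, amount_in_cents):
        return {"status": "approved", "amount": amount_in_cents}

class LivePaymentGateway:
    """Placeholder for the real client; only exercised when RUN_LIVE=1."""
    def charge(self, amount_in_cents):
        raise NotImplementedError("call the real payment service here")

@pytest.fixture(params=["fake"] + (["live"] if os.environ.get("RUN_LIVE") == "1" else []))
def gateway(request):
    return FakePaymentGateway() if request.param == "fake" else LivePaymentGateway()

def test_charge_is_approved(gateway):
    # Same assertion, whichever implementation is behind the fixture.
    assert gateway.charge(1000)["status"] == "approved"
```

If the fake and the live system ever disagree on this test, that disagreement is itself a bug: either the mock has drifted, or the dependency has changed underneath you.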
The answers to the following simple questions are also very important: Can you run your app offline? Can you run your service offline? Can you run your application that you are coding offline? These are things that management can use to measure the pulse, and the quality, of the current development environment, and the continuous integration of particular teams.
So, if you have three teams that code in the way described above, and one team that doesn’t, you can guess which team is probably not shipping regularly. This same team will be unable to develop things very quickly, and it will be unable to respond quickly to customer and business requirements. More importantly, the team that doesn’t run its apps offline will be the team that has to deal with a whole number of bugs and QA issues every time it releases. The teams that run their apps offline, with QA cycles, won’t encounter these kinds of obstacles.
1.2.11 When devs are cloud admins
- when moving to the cloud, Devs are becoming SysAdmins
- this is a massive issue and is creating a large number of security problems
- in some cases this move is literally ‘throwing the baby out with the bathwater’, where the lack of innovation, speed, and features from the local admin/IT teams pushed the developers to have free rein in the cloud (like AWS), where they can create servers and VMs in seconds (vs days, weeks, or months)
1.3 Sec DevOps patterns
- provide examples