Any security controls you use to build applications will eventually be broken (or fail). How do we build our systems in a way that security incidents won’t happen even if some components fail? How do we prevent data leaks if penetration tests succeed? “Defense in depth” is a security engineering pattern that suggests building an independent set of security controls aimed at mitigating more risks even if the attacker crosses the outer perimeter. Anastasiia Voitova models the threats and risks for a modern distributed application, and improves it with multiple lines of defense. She overviews high-level patterns and tools from the security engineering world, and explains them for DevOps practitioners and architects. The discussion includes practical security engineering approaches and covers security controls from complex encryption schemes to modern DevOps tools.
So, for the next 45 minutes, “crypto” means cryptography, okay? Not cryptocurrency, no blockchain. No. Cryptography, okay?
And, before we start, let me ask you, are there any security people here? Like blue team, red team, [inaudible 00:00:26] testers, auditors? Okay, DevOps, DevSecOps, maybe? Okay, people, who are you? Engineers? DevOps? No? Yeah? Architects?
Who? PM? Why not? Okay. Okay. Okay. So, I’ll have just a couple of questions in my talk, a couple of very security-related slides. But no worries. You can lie to me that you know the answer, okay?
So this talk is going to be a cryptography talk, but a very high-level cryptography talk. No worries, no ciphers, no math. Okay. No ciphers, no math. Just security design. And we’re going to talk about building multi-layered defenses for your data.
So, out of all the domains of security, we’re going to focus on data security. Okay. Got it. Awesome.
My name is Anastasiia, and I was developing mobile applications for ages, and at some point, I… That’s an unexpected side effect. All good? Can I continue? Yeah? Okay.
At some point, I switched to security and I’m working as a security software engineer and now my title is product engineer, or chief product officer, if you want, at a company that makes security solutions.
So I kind of balance between the world of cryptography, shiny ciphers, modern ideas, yada yada, and applied engineering: people that don’t have a PhD in cryptography, but still want to create good applications.
I work in a company called Cossack Labs, and at Cossack Labs we came from R&D and military crypto. We’re kind of making tools like military crypto tools, and we try to make them accessible for SMEs, for startups. And first we thought that if we take modern cryptography and give people easy-to-use cryptography, developers will be very excited to use it and it’ll be fine. Is it working this way? No.
Instead of trying to teach developers to create more secure applications, we were like, “Okay, okay. We’ll create tools that are easy to use and hard to misuse.” So even if you don’t have a security team at your company, by applying the right tools in the right way you’ll still be able to lower the risk of security incidents and raise the bar for attackers to grab your data.
By the way, in the presentation we’re going to have a lot of links and a lot of charts, so the slides are already available online if it’s complicated for you to see all of them. You can scan the QR code and have the slides on your phone.
Okay, good. This is our plan, and for the next 40 minutes we’re going to review a few steps. First of all, I will explain the current regulations around data security in the market, then we will take a look at the risks [inaudible 00:03:58] of modern web architecture. Then we cover the defense in depth idea, how to apply defense in depth to your data, and, what is more important, which tools to use, because obviously.
First of all, have you heard of defense in depth before? Do you know the term defense in depth? Raise your hand. You can lie to me, no worries. Have you read any of these books? No. Expected. So defense in depth is not something new. Basically, in cyber security there are no new terms; all these terms come from war, from warfare, and so does defense in depth. In computer science we use defense in depth when we are talking about network security, about creating intranets, creating perimeters, yada yada. And now we switch the angle a little bit and we’re talking about defense in depth for data, not only for your network, not only for your infrastructure.
Why do we care about data? Well, obviously. First of all, if we don’t care about data, we have regulations pushing on us, right? The fines are crazy, competitors can steal our databases, you know all the things. And of course, users are also upset. What I really like is that regulations have started pushing on large companies, on large providers, and if you as a company are using a service from a service provider, the service provider starts pushing on you. I’ll explain it a little bit later.
You know GDPR. And if someone says to you that GDPR is all about checkboxes and emails, show them this slide, because GDPR is also about encryption. Articles 32 and 35 are about data processing and data encryption. Articles 33 and 34 are about monitoring in case of a data breach, in case of an incident.
Article 32: taking into account yada yada yada yada, you should use pseudonymisation and encryption of personal data. You know the difference, right? You know what pseudonymisation is. Lie to me. Yes, awesome. Right.
So if someone tells you, “Okay, GDPR doesn’t have anything to do with encryption.” No no no, Article 32, it does. But GDPR is a regulation about human rights, it’s not a technical law, and it doesn’t state what kind of ciphers to use, what kind of protocols to use.
By the way, GDPR is like an umbrella regulation, right? So some countries created their own internal laws, internal regulations based on GDPR, indicating exact ciphers. Like in Portugal, for example, they have their own law, and they actually put in what ciphers to use. And, I don’t know, they put really old ciphers there, not the state of the art. I don’t know how Portuguese companies, at least governmental companies, handle it in-house, but.
So, GDPR doesn’t say anything about what kind of ciphers to use and what kind of encryption to use, but the US Department of Defense does. Last year they released a document, an internal document, on how to build software correctly, how to build software that the US Department of Defense would use. Right, so software for themselves. And this document contains ten principles, so that the software will be better and faster and the US Department of Defense will be cooler than all their attackers and adversaries. It’s very interesting to read. There is a link to it here.
Out of the ten principles, principle number 9 says that data should be encrypted all the time, except for the short moments when it is being computed. Which means that TLS is not enough. Which means that database encryption is not enough. Which means that TLS plus database encryption is not enough. Data should be encrypted all the time except when it is being calculated, being processed. So data should be encrypted per application, per user, and only some trusted services can decrypt it.
Which actually sounds cool, but there are apps for that. I mean, it’s not something unrealistic, that’s a totally realistic thing. And maybe you heard that field level encryption is going to be released; it was announced recently, yesterday, I don’t know, I’m jet lagged, I don’t know which day that was. So literally field level encryption, and they said like, “Wow, field level encryption, you know, for each database field, it’s so awesome, it’s so modern.” Come on people, [inaudible 00:09:34] already here. For example our tool, which is open source and free, has already been on the market for three years. It’s not so new, it’s not so modern, anyway.
Remember I told you about service providers that are pushing on the companies that use them. For example Google. Google says, this is really fun, Google says that if you’re a company that uses the Google API to collect data of Google users and store this data on your own servers, you as a company must do a security assessment and provide the results of the security assessment to Google. And Google expects that the security assessment may cost you, the company, between 15,000 and 75,000 dollars, and you pay this money not to Google, no, but to the companies that will do the security assessment for you. For which Google, obviously, will have some list of totally, you know, independent companies.
Anyway, some of our customers received these kinds of emails and now they need to do the security assessment to show something by the end of the year. And Google’s approach is really pushy, I would say. Apple, for example, doesn’t push this way. They just update their security policies very slightly, and if you’re doing apps, if you’re uploading your apps to the App Store, you can see how the data security policies are being changed. Now, if your application is for kids, you can’t use any ads, you can’t add any analytics or ad trackers to your app. So Apple is pushing these things in a more comfortable way.
And of course you have heard of OWASP Top-10, right? Lie to me.
Awesome. When we are talking about data protection, what exactly are we going to protect? Because there are different kinds of data. We all know about user data, which is business data, which is valuable data for our business, okay, easy. We often forget about service data, because PMs usually don’t know that our services operate some kind of technical data, like API tokens and keys, and data protection of these pieces is usually not a [inaudible 00:12:02] ticket.
And developers tend to forget that they have this data as well, but in the real world we operate different classes of data, and we need to protect all of them, because losing each of them will have a different impact on our company. Losing customers’ data is obviously a compliance risk and a legal risk, but losing technical data, losing service data means that someone can mine bitcoins on our server side, which is not so good either.
Okay, this was a very important slide. Customers, end users usually don’t have a way to understand how well you do data protection, so it’s your responsibility. Now let’s get to the more technical side of things. If you want to take a photo, yeah. All good.
Okay, so charts, charts, charts, charts. This is our simplified web architecture. You know, a little bit simpler than what you probably have in your apps. But what we have here is a data flow. So the data is being generated on a front end, on some web application, and it gets processed on the application backend, and usually that’s not one server, it’s usually a whole infrastructure. And then the data is being stored somewhere, right? And we can also show this as a layered model.
In typical security design models, attackers are only outside and they need to pass through all of our components to grab the data. You know, there are dozens of attacks for each component, for a Node.js server for example. But some attacks are very special, like SQL injection. SQL injection is like a hyperjump, because a successful SQL injection means that attackers just jump to the database, grab the data and jump back from the database. Kind of bypassing all our defenses.
Again, in the real world a lot of attacks occur because of insiders. I saw different numbers: depending on the industry, depending on how valuable the data in this industry is, the number varies from ten or fifteen percent of attacks made by insiders up to 75. So when you design your application architecture, you should take into account all risks, including insider risks. Including the point that insiders can actually be on any layer of your defenses.
Usually people tell me, “Hey Anastasiia, okay, okay, we know all the things, we already know a lot of security controls, we are already prepared, I mean we are cool developers, we are cool DevOps. We are the best DevOps. We know all the things.” And I have a question for you. Yeah, I agree we know a lot of security controls, okay, okay. So these are all real security controls, real methods or tools or acronyms that are used in security, except one. Can you find the one?
You are really quick. Okay, okay, okay okay. Yeah, RTFM, okay okay. So do you know what IDS is? IDS.
Intrusion Detection System.
Yes intrusion detection system. KMS?
Key management system.
Key management system, okay okay. Nice. TDE? TDE? Transparent Data Encryption. Okay, okay okay okay. Well, everything else is quite easy. AAA? Authentication, Authorization and Accounting. Good, good enough.
So we already know a lot of security controls, we already have a lot of tools. When we put all these tools into our typical distributed web application, we can have something like this. We can have means of active protection, like firewalls, and again we have different kinds of firewalls: we have WAF, web application firewall, we have data firewall, we have network firewall, right? Then of course we use access control, access restriction for data, we use ACLs for data in databases. We use secure coding practices on our application side. We use dependency management to check our packages, right? Then of course, all the actions are covered by logs and we have something to monitor the logs, SIEM systems, security information and event management systems, and then we have something to monitor the monitoring system. Security.
And probably we have some backups, and if we are lucky enough, the backups are working, and if we are lucky enough, the backups are encrypted, because backups are a really good attack vector. And all these things are going to work, until they are broken. And usually these systems are messy, they are [inaudible 00:18:11]. Usually in a large company we have different people, different teams responsible for each of these things, and they are not aligned. Sometimes these means of security, these means of protection are called checklist protection or checklist security. We call them band, oh sorry. Band-Aid security model. Go back, go back. Animation, animation, yes. I was working hard, okay?
We call them the Band-Aid security model, and developers often tend to think about them as perimeters. Like, a perimeter, a perimeter, and another perimeter. However, attackers think in graphs: they construct a vulnerability path and then move from perimeter to perimeter, and eventually they will find a vulnerability in each of these perimeters. This is what an attack graph looks like. This is an example, a really old picture, of some kind of infrastructure. We see firewalls, here an IDS, and an example of an attack graph, and there are vulnerabilities in each of these components.
Right? So let’s get back to our system. We already have tons of security controls here, but it still holds a lot of risks and potential attacks. Application injections, backend vulnerabilities, remember all those Node.js modules that are not covered by monitoring, or we don’t look, we don’t read the results of monitoring. Misconfigured database access, my favorite, AWS S3 buckets that are left public, because why not, right? Even if we have this encryption checkbox.
Then we can use TLS, fine. TLS, the ultimate protection measure. So we can use TLS, but what if we obtain a certificate from an unreliable certificate authority, or if we have weak ciphers, or if we use TLS but we don’t use certificate [inaudible 00:20:38], right? Then, if we have our backups, but they are plain text. Again, if we have a lot of logs, but at some point we have this log overflow. It’s cool to have logs, but bad if we don’t pay any attention to them.
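To make the TLS point concrete, here is a minimal sketch of a client-side TLS configuration using only Python’s standard `ssl` module. It is an illustration rather than anything shown in the talk: the function name `make_client_tls_context` and the choice of TLS 1.2 as the lowest accepted version are assumptions.

```python
import ssl


def make_client_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context with conservative defaults."""
    # create_default_context() loads the system's trusted CA certificates
    # and enables certificate and hostname verification.
    context = ssl.create_default_context()
    # Refuse protocol versions older than TLS 1.2 (an assumed policy floor).
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


if __name__ == "__main__":
    ctx = make_client_tls_context()
    print(ctx.verify_mode, ctx.check_hostname, ctx.minimum_version)
```

The point of the sketch is that “we use TLS” only helps when verification stays switched on and old protocol versions are refused.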
So, defense in depth says that, well, obviously one security control is not enough, okay. But even when you have many of them, if they are chaotic, it doesn’t make a lot of sense. You will get better results if, for each group of risks, you build an independent but interconnected set of security controls.
How to build that? First of all, again, we are talking about data protection, so on the data level we understand the data flow: how our data is being generated, processed and where it’s stored. It’s nice to have security controls that protect the data globally during the whole data flow. Then, we find high-risk attack vectors, risky pathways in our infrastructure, and we build interconnected, overlapping security controls for these places of high risk. So we have a global control, and then we have a set of overlapping controls for some points. Which global control, which tool, which method, which mechanism can we use to protect the data during the whole data flow?
The first letter of the answer is ‘C’.
The second letter is ‘R’.
Crypto, cryptography, yes. We use cryptography because if data is encrypted it can’t be accidentally, you know, suddenly decrypted. No, there is some place in the code, there is some service that will decrypt the data. And if we have data encrypted during the whole data flow, like field level encryption, data blob level encryption, we can monitor the places where the data is decrypted. Then cryptography, well-implemented cryptography, allows you to narrow down the attack surface and makes your data flow predictable. It’s easier to monitor anomalies when you understand what’s going on. Actually we build our tools around this idea, that we use cryptography as a global layer of defense and then we put other security controls, like traditional security controls, to double, to triple protect the bottlenecks.
Depending on the infrastructure you can use different kinds of cryptographic tools. So if you have a mobile or IoT-based, IoT-first infrastructure, you can build end-to-end encryption there. Unfortunately, if you have large distributed web apps, end-to-end encryption is probably not the first thing to do for you. So for them, I’ll discuss right now another approach called decryption proxy. It’s an easy pattern. I don’t want to explain the pattern from the architecture perspective, let’s jump to infrastructure. Decryption proxy says that there is some side that encrypts data and there is another side, the decryption proxy, that decrypts data. So you kind of separate encryption and decryption.
Why? What’s bad, what’s wrong with encrypting and decrypting data on the same component? If we use, let’s say, simple symmetric encryption, we have one key to encrypt and decrypt data. What’s wrong with encrypting and decrypting on the same component?
Yes, yes. This is why database level encryption doesn’t make a lot of sense, because anyone who has access to the database server has access to the key. There is nothing wrong with having encryption and decryption on the same component, but we kind of add more trust to the infrastructure if we separate them. So we can do encryption on the client side but decryption in the trusted zone, the trusted environment that we monitor. That’s easy.
This makes the data flow very predictable, right? We have one path where we write data, or encrypt data, and we have another path where we read data, or decrypt data. Again, predictable things are good for security because it’s easier to analyze anomalies and to understand what’s going on.
I know, I know, it may be a little bit high level, so let’s get back to our super cool infrastructure. So, how to build a decryption proxy in this data flow?
We actually separated our application into two different components, different services. The application backend itself will do client-side encryption, okay. I’m from traditional cryptography, I don’t believe in cryptography in the browser yet. From the cryptographic perspective, the client side is actually the application side. So we will have client-side encryption in our application backend and we will have decryption in a separate component, in a separate service, the decryption proxy.
On the write data flow, the data is generated somewhere, it gets to the application backend, it’s encrypted there, sent encrypted through our infrastructure, and stored encrypted in the database.
To read data from the database, we need to make read requests through the decryption proxy, which obviously works like a proxy: it redirects requests to the database, it gets the encrypted response, it decrypts the data and redirects the response back to the client side.
What is fun here is the key model. We don’t use one key anymore. We use a public key to encrypt the data, and obviously the application backend doesn’t have a private key to decrypt the data, only the public one. We use the private key to decrypt the data. And of course we store private keys in a very trusted zone, in some HSM, in you [inaudible 00:28:01], it all depends.
And as long as the application backend and the database don’t have the private key, they can’t accidentally decrypt the data.
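The read path through the decryption proxy can be sketched in a few lines of Python. This is a structural illustration only, not the code of any real tool: the class `DecryptionProxy` and its three callables are hypothetical names, and the cipher is deliberately left abstract. In a real deployment `decrypt` would wrap a vetted public-key scheme whose private key lives in the trusted zone.

```python
from typing import Callable


class AccessDenied(Exception):
    """Raised when a client is not allowed to read a record."""


class DecryptionProxy:
    """The only component that is handed the ability to decrypt."""

    def __init__(
        self,
        fetch_encrypted: Callable[[str], bytes],
        decrypt: Callable[[bytes], bytes],
        is_authorised: Callable[[str, str], bool],
    ) -> None:
        self._fetch_encrypted = fetch_encrypted  # talks to the database
        self._decrypt = decrypt                  # wraps the private key
        self._is_authorised = is_authorised      # access policy check

    def read(self, client_id: str, record_id: str) -> bytes:
        # Check the client first, so that unauthorised requests never
        # reach the database or the key.
        if not self._is_authorised(client_id, record_id):
            raise AccessDenied(f"{client_id} may not read {record_id}")
        ciphertext = self._fetch_encrypted(record_id)
        return self._decrypt(ciphertext)
```

Because the backend and the database are never given the `decrypt` callable, every plaintext read has to pass through `read`, which is what makes the read path easy to monitor.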
Technically, the key model is a little bit more complex. This is the key model from the software we use. So it’s more than one key, and of course you can’t have private keys in plain text either, so you encrypt the private keys as well. I can elaborate on that or I can skip the slide and continue without digging through multi-layer encryption. Okay, I see your faces, okay.
If you’re really curious in terms of ciphers, in terms of encryption, how it works, let’s just discuss it after session. Let’s move forward.
When we separate encryption and decryption, we make it more complicated to break our system, to compromise our system. Because the database actually doesn’t know what the data being encrypted is; the database doesn’t have any keys, doesn’t have any clue. Break the database, all fine. The application doesn’t know a way to decrypt the data, so the maximum you can get if you break the app is the data generated on this app, its own data.
However, for defense in depth, encryption itself is not enough, so let’s get back to the world of all the shiny security controls we know, and add them to our system. Because cryptography was the first, global layer of our defense. We need to double and to triple the defenses. What can we have here?
First of all, yeah, TLS, nice. Then our decryption proxy, if you think about it, is the only point that decrypts the data, so at this point we can have some authentication. The decryption proxy gets a request from the client side and it checks: does this client have the right to access this data? No? Done. Then, SQL firewall. If we’re talking about an SQL database, of course.
We got some request that asks us to select all from the database. Does it make sense? No, stop the request. Then we use what is called compartmentalization, like different zones, different keys for different clients, for different users maybe. So when we get a request from a client with these keys, “please decrypt my data with these keys”, the decryption proxy checks: okay, probably the keys are wrong, you can’t ask to decrypt data from another user if you have these keys, stop.
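Both checks can be illustrated with a short sketch. Everything here is an assumption made for illustration: the names `check_query` and `check_zone` are hypothetical, and the regular expression is a crude stand-in for the SQL parsing that a real SQL firewall performs.

```python
import re
from typing import Dict, Set


class RequestBlocked(Exception):
    """Raised when the proxy refuses to serve a request."""


# Hypothetical rule: a SELECT * with no WHERE clause is treated as a bulk dump.
_UNBOUNDED_SELECT = re.compile(
    r"^\s*select\s+\*\s+from\s+[\w.]+\s*;?\s*$", re.IGNORECASE
)


def check_query(sql: str) -> None:
    """Reject queries that try to read a whole table at once."""
    if _UNBOUNDED_SELECT.match(sql):
        raise RequestBlocked("unbounded SELECT * is not allowed")


def check_zone(client_id: str, record_zone: str,
               client_zones: Dict[str, Set[str]]) -> None:
    """Reject requests for records outside the zones assigned to the client."""
    if record_zone not in client_zones.get(client_id, set()):
        raise RequestBlocked(f"{client_id} has no keys for zone {record_zone}")
```

A proxy would run both checks before touching the database, so a request that fails either one never reaches the decryption step.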
Then, which is also useful, honey pots, or honey tokens as we call them. So in our database, among the encrypted records, we can have special markers, special records, that look completely the same, because they’re encrypted, right, they don’t make any sense for humans, for attackers. So these are records that look completely the same but should never be read by a client. So when the proxy makes a request to the database and gets data with these markers in the database response, it means that someone has already broken through our SQL firewall and this is not a legitimate request, and the decryption proxy can stop this request. So, all these things are traditional security controls, not cryptography. Traditional security controls we can put into our infrastructure to help cryptography. Because cryptography is not a magic wand, it’s not like that. We need to do all the traditional stuff as well.
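As a rough illustration of the honey token check, the sketch below screens a database response before anything is decrypted. The function name `screen_rows` and the idea of keeping the planted records in a set are assumptions, not the mechanism of any particular product.

```python
from typing import FrozenSet, Iterable, List


class IntrusionDetected(Exception):
    """Raised when a response contains a record no client should ever read."""


def screen_rows(rows: Iterable[bytes],
                honey_tokens: FrozenSet[bytes]) -> List[bytes]:
    """Return the rows unchanged, or abort if any planted record shows up."""
    screened = []
    for row in rows:
        if row in honey_tokens:
            # A honey token in the response means the query bypassed the
            # earlier checks, so the whole request is dropped.
            raise IntrusionDetected("honey token present in database response")
        screened.append(row)
    return screened
```

Legitimate queries never select the planted records, so a single hit is a strong signal rather than a statistical one.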
Got it, right? Intrusion detection, firewalls, KMS, all the different transport layer encryption.
Let’s go back, yes. And the third point is very important: the data is being watched. As soon as we understand the data flow, we can build these monitoring tools and audit logs, for example. We can write to the logs that the data was accessed by user X with key something something, so in case of an incident we have audit logs as well. And when you’re building a system which has cryptography as the first layer of defense, you can actually have a lot of useful data for your SIEMs, for your monitoring tools. If you see some peak of decryption operations in your SIEM, that’s probably an anomaly, that’s probably someone who has broken through your system of defenses, got a lot of data and started decrypting this data, so you have tools to understand what’s going on here.
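A minimal sketch of these two ideas, an audit log entry per decryption and a check for a peak of decryption operations, could look like this. The names `audit_entry` and `DecryptionRateMonitor`, the JSON field names, and the fixed sliding-window threshold are all assumptions; a real SIEM would apply its own rules.

```python
import json
from collections import deque
from typing import Deque


def audit_entry(user: str, key_id: str, timestamp: float) -> str:
    """Serialise one decryption event as a JSON line for the audit log."""
    return json.dumps(
        {"event": "decrypt", "user": user, "key_id": key_id, "ts": timestamp},
        sort_keys=True,
    )


class DecryptionRateMonitor:
    """Flags a spike when too many decryptions fall inside a time window."""

    def __init__(self, window_seconds: float, max_events: int) -> None:
        self.window_seconds = window_seconds
        self.max_events = max_events
        self._timestamps: Deque[float] = deque()

    def record(self, timestamp: float) -> bool:
        """Register one decryption; return True if the rate looks anomalous."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop events that have slid out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_events
```

Timestamps are passed in explicitly so the monitor can be replayed over existing logs as well as fed live events.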
And as I mentioned previously, it gets a little bit complicated, just a little bit complicated, to compromise the whole system, because the only valid way to get decrypted data, to get data in plain text, is to read it through the decryption proxy. Otherwise you need to compromise the backend application, to go through the SQL firewall, through the intrusion detection system, to handle all the different keys for the different users, and then of course not to leave anything in the logs and not to trigger any monitoring or anomaly detection events.
So this is called defense in depth, and these are the layers of defense around data which we discussed. Starting from the data itself, we use encryption of separate data pieces, just simple encryption, in our case for example we use AES-GCM. So simple, traditional, strong encryption. Then we use different keys for different clients, which actually works as authentication, as access control, because you can’t decrypt data if you don’t have the key for it. Then of course we still have transport layer encryption, TLS, or in case we’re not tied to TLS we can have new modern cryptographic protocols, like Signal for example.
And then there’s the monitoring layer. For me it looks really similar to this one. Do you know what this is? Yes. You see? PM.
Yeah, this is a transatlantic cable. The cable itself, this is its core, the optical core, and all the other things are layers of defense. And the last cover is actually shark protection, because you know, sharks, they like cables. Okay, okay okay okay.
One more point about defense in depth: how to start thinking in terms of defense in depth for infrastructure. Try to think that your data is your money and you’re a bank. So, as a bank you keep your money somewhere, in a bank vault, and for us it can be, for example, an AWS S3 bucket, this is your vault, and you have this file protection, like file level encryption, like S3 bucket encryption, cool, this is kind of the door of your vault. And usually if you think about bank robberies, not so many people try to steal from storage, from the vault.
Either you are an insider and you have keys, or usually you don’t, because you understand that the means of protection are quite strong. How do people usually steal money? During transport, during transportation. That’s why when you get your data from your S3 bucket and try to process it, to calculate something, you need to build all the defenses around your data, similar to the defenses people build to transport money: they have special armored cars, and on these cars they don’t have external locks to open the doors from the outside, then they have trained people, officers with guns, they have special bags to put the money in. Of course, they have radios to communicate with each other, and the route is known, so someone is watching them. And you can think about these as layers of defense around your data. Of course, that’s if the application you make operates with sensitive data, valuable data, because if you create Instagram for kittens, I understand, these things are probably not for you, okay, okay, okay.
So again, how to build a system with the defense in depth approach in mind: you need to have one global security control, which is cryptography; cryptography is really cool for that. And then you create an independent but overlapping set of traditional security controls. Kittens. Kittens in tin foil hats.
How to build it? How to build this kind of system? Of course you can build it on your own, I mean you are cool engineers who can build those things from scratch. Just one small piece of advice: start from security design, start from the trust model, the risk model and potential attacks, to understand the data flow of the system. Then you can use boxed solutions, for example if you use Oracle. Is somebody using Oracle here? So, no, okay.
In case you’re using tools from one large vendor, usually those vendors already have a set of tools that you can use by paying them. For example, the Oracle suite is not only the database itself: Oracle has TDE, Transparent Data Encryption, and Oracle has an SQL firewall, so if you stick to one vendor, the vendor probably already has these kinds of things. If you don’t, fortunately there are a lot of open source tools available. The one I showed, the orange one, is Acra; you can use it as a decryption proxy, as something that will decrypt and encrypt your data, with, of course, client-side libraries to encrypt data.
If you don’t want to use Acra you can use, for example, GreenSQL, it’s an SQL firewall, then libsodium, which is a cryptographic library, and build this cryptographic library into your own decryption proxy. It depends on your level, on how many things you want to do yourself. You can use a separate IDS, intrusion detection system; well, SIEMs are usually separate tools. All these things, most of them are open source tools, you can use them, and if you use Docker, if you use Kubernetes, most things are already there. The one I showed you is Acra, open source, free, and if you use Docker, it’s compatible with Docker, it’s compatible with DigitalOcean, and if you want to, yes. Okay, awesome.
If you build a system like the kind of system we discussed, you already cover these things from the Top-10. Okay. Okay, a video of a kitty. You survived, you survived the cryptographic talk.
These are the key points. Why are we talking about data security? We are talking about money that we as a company can lose. Why? Because regulations are pushing the market, because of the huge fees, or the huge fines, because regulations are pushing on service providers, like Google, Apple, Amazon, and we use services from those providers, and I predict that those providers will push even more on the companies that use them in terms of data protection, security assessments, yada yada.
Then, defense in depth itself is not something new, but when we are talking about defense in depth for data we are talking about building a system that watches data, with different entities on different layers. And these layers are not chaotic, because we already know a lot of security controls; they are not chaotic but actually connected with each other. Then cryptography is a good thing to start with, because when you have encrypted data it can’t be suddenly decrypted and you can watch the places where you decrypt data. And it’s 2019 already, there is an app for that, there is a tool for that. Most likely you’ll find out that there are tools from vendors, there are open source tools, and you can try them, you can use them, you can build them into your systems.
And I know, I know, I know you like security a lot and you like reading about security a lot, so I prepared a list of articles for you, so you can read them on long, long evenings. Why not read about security? Articles about defense in depth, about backend security, database security, yada yada. And of course you can follow me on Twitter, yes. If you are curious about different cryptographic tools, check out our slides, our site. I think that’s all. Yeah, I think that’s the last slide I have, and perfect timing. So thank you for your attention, thank you. Yeah, yeah, I know, I know, I’m around a little bit. Now we need to go to the separate area for discussion, right?
Right, that’s really important. On your tables, you can find these old-school physical cards, and you can put your feedback on those cards, which will help JFrog to build better conferences and maybe to invite me more, it’s big. Yeah. Just, it’s plain text, so don’t put your sensitive data there. All encrypted, all encrypted. Okay, so if you want to talk about cryptography, database encryption or searchable encryption, my topic, I really like searchable encryption, how to have data encrypted but still searchable, just follow me to the discussion area, somewhere here, I have no clue, okay, okay, okay.
Thank you. Oh, second round of applause, yes