This webcast was originally published September 4, 2019.
In this video, the speakers discuss the powerful capabilities of the ELK stack (Elasticsearch, Logstash, and Kibana) for handling and analyzing Sysmon logs to improve cybersecurity measures. They explore how to set up and configure the ELK stack to ingest, parse, and visualize important security data from Sysmon, providing insights into the potential applications for enhancing incident response and threat detection. The video also covers the integration of Sysmon with ELK to leverage advanced logging and monitoring techniques for better security analytics.
- Sysmon and AppLocker are effective tools for enhancing security through monitoring and application whitelisting.
- ELK stack (Elasticsearch, Logstash, Kibana) is increasingly popular for its powerful capabilities in log analysis and security information and event management (SIEM).
- Backdoors and Breaches game facilitates hands-on incident response training within organizations.
Transcript
John
All right, so let’s get started, everyone. This is a continuation of the webcast that you all asked for. So, if you don’t like it, it’s all your fault. No, in the previous webcast, we spent some time talking about Sysmon and setting up AppLocker for application whitelisting.
And that was a good webcast to get those basics and those fundamentals out of the way, really kind of reverting back to that concept of basics and fundamentals with Sysmon.
Exactly. Getting that Sysmon data into a format or into a place where it provides value. And that’s really where ELK comes into play. And we’ll talk about Elasticsearch, Logstash, and Kibana here in a couple of seconds.
We’ll kind of set the stage of the entire webcast. But it’s all about trying to get that really great quality data of Sysmon into a SIEM, “sim” or “seem,” however you want to pronounce it, and getting it to where you can actually do something with it.
And traditionally, all the cool kids go through these weird kind of cycles in computer security. For a while, the cool kids were doing nothing but sed and awk scripts parsing through syslog.
Then the cool kids started getting into actual SIEM technology, maybe ArcSight, and then Splunk took off and everyone was doing that. But now if you’re looking at the cool kids, they’re using ELK stacks for doing their SIEM and their analysis.
And there’s a number of really good reasons why ELK is so incredibly powerful for what we’re trying to do whenever it comes to log analysis. However, whenever we’re looking at this, we’ve got to do a couple of housekeeping things before we get started.
First, this is brought to you by Backdoors and Breaches. With Backdoors and Breaches, we have a whole game that’s designed around incident response and kind of creating that tabletop exercise to really facilitate the conversation of IR in your organization.
So you have red cards, which are initial attack and compromise. The brown cards are C2 and exfil, getting data out of an environment. The purple cards are the cards for persistence.
And then the yellow cards that you see there with the BloodHound logo are lateral movement in an environment. So that builds the incident, and then you work the incident with your blue cards, which are your procedures.
And then we insert injects, which are the gray cards, periodically through the game to keep the game interesting. So if you’re interested, just type Backdoors and Breaches into the questions window.
And we will try to get people a deck before the official launch. I don’t know what the official launch is now that they went to the printer and they’re going to be available at DerbyCon. If you’re there at DerbyCon, you’ve got to come down.
We’re bringing a thousand decks, and I know that sounds like a lot, but I’m willing to bet that they’re going to be out very, very, very quickly. So come check it out. Also, it’s brought to you by Black Hills Information Security.
We do penetration testing. As much as we joke that we are an education company that gives free webcasts, really that is completely funded by penetration testing, open source tool development, red teaming, and threat hunting.
We also now do incident response. If you’re in the middle of an incident and you want some help, or somebody to just give you a hug, come to our website at bhis.co or shoot us an email at consulting at Black Hills Information Security.
It’s also brought to you by AI Hunter. AI Hunter stands for Actual Intelligence Hunter. It’s designed to identify signal in the noise when it comes to network traffic: identifying beaconing activity, threat intelligence feeds, and correlating all of that in a way that gives you meaningful insight into the egress traffic that’s leaving your environment.
If you want a demo with me or Chris Brenton, all you have to do is just type in demo in the questions window and you will get a personal invite from myself and Chris Brenton, and we will set up a demo of AI Hunter and show you why it’s cool.
All right, Wild West Hackin’ Fest is coming up at the end of October. Please check it out. I don’t want to say we’re close to selling out because that sounds really kind of pushy from a used car salesman perspective.
But let’s just put it this way. The state of South Dakota can only handle so many people in this entire state at one time. And we are getting close to that hard limit whenever it comes to the state of South Dakota.
And I joke about that, but it’s true. In the past we’ve had trouble getting enough rental cars up here, and it’s awesome. But yes, get here early, get yourself a hotel room, get a rental car, and hopefully I’ll see you at the end of October.
So, problem statement. I keep coming back to this slide for the past three webcasts when we’re talking about the MITRE ATT&CK technique matrix, and I feel like this became a solution in search of a problem, or maybe it’s a problem in search of solutions.
But I think that the MITRE ATT&CK technique matrix really uncovered just how bad our ability to actually detect a large number of different attacks actually is, or even better yet, even stopping a number of those attacks.
So in some of the webcasts that we’ve been doing over the past few weeks, we’ve been really focusing on trying to shut down as many of these different capabilities an attacker has as possible. So when we’re talking about AppLocker, AppLocker is really effective at stopping this initial access, this execution, maybe even the persistence side of it, privilege escalation, by trying to restrict what it is an attacker can do.
Now, I know that there’s people on this webcast that will ultimately say, well, you can always get by AppLocker using these following techniques. And that’s absolutely true. That’s what we do for a living. But if we can severely limit the options that are available to an attacker, then we’re actually doing better.
And as I said in the AppLocker presentation we did last week, if you gave me a choice of just blacklist AV or AppLocker, I would take AppLocker any day of the week.
So we’ve got a lot there for trying to stop a lot of these attacks. But then we discussed Sysmon and how Sysmon can be used to detect a large number of these attacks. Then we get into an issue of how do we do that at scale.
Talking about Sysmon in one-off situations is an absolutely wonderful thing to discuss because it’s a really cool tool and it’s super easy to actually implement. But how do we get Sysmon to the point where we can actually get it loaded into a SIEM where we can actually do some effective analysis on that?
So the executive problem statement is this. Is our logging actually working? One of the problems I hear from many executives is we spend hundreds of thousands, if not millions of dollars on SIEM.
Is the SIEM actually working in our environment? And many times this question is answered anytime they work with a penetration testing firm. And that’s pretty much a resounding nope.
At BHIS, we very rarely are detected in a way where someone says, hey, we got you in the SIEM, and we’re like, oh, drats, there’s nothing we could have done about that. Almost always, organizations will have to go through after the fact and identify what we actually did.
But the actual alerts and the full breakdown of what we did and how we did it is actually fundamentally lacking. Are we logging too much? The answer to that question is a solid yes. You are, in fact, logging far more than you probably need to be logging.
One of the questions I put forth in almost every single webcast that I talk about logging is if you look at the total percentage of your logs, what percentage of those logs have any alerts or reporting associated with them, and you’re probably going to find it’s greater than 90% of the logs are just white noise that is being loaded into a back end, a data lake, something like that.
And it is basically just taking up space and you’ll never do anything with it. Now, what is a data lake? Well, when we’re talking about event logging, right now we have two problems.
One, the amount of data that’s coming in is huge, and then we have a lot of data that we then sidechain off to other things. A data lake would be a secondary storage for it.
What is my team actually doing? I get those questions a lot from executives after a penetration tester goes in and breaks their environment. They say, we spent all this money on the SIEM technology and they can’t identify what the testers actually did.
And that is a trap as an executive because it’s easy to immediately blame the security team when I think it’s probably better to blame the tools and the telemetry in an organization.
Now, the last webcast I did a bit of a challenge. It was basically, we pretended that the executive in this screen was playing a video game.
And I asked you what video game was he playing? Now, last week it was Roger Wilco and the Time Rippers. It was basically a video game that our executive was playing. This time I was going to make it a little bit more difficult because I didn’t want it to be something that you could immediately google and go right into the answer.
So this is, “I am a Youngblood, and I will avenge my fallen classmates.” So what video game am I discussing here? What game is that executive actually playing?
So we’ll see who got the right answer, and of course, we’ll give you a special prize if you’re the first person to answer that one correctly. So let’s move on.
All right, Sysmon. Once again, Windows logging is just bad, like really, really horrifically horrible. Now, in a couple of webcasts down the line, Mick Douglas is going to come on.
We’re going to get a better safety net, a better digital safety net, and he’s going to talk more about what we can actually do effectively with ELK. So we’re going to talk about just getting logs into ELK in this webcast.
We’ll talk about the importance of ELK, what it actually is, and why it’s something that you want to look into. But we’re going to discuss this problem a little bit more. We discussed ingest: what is the data that we’re pulling in? Windows event logging is just horrific.
Let me show you a little bit of the pros and cons. So let’s go here to our Chicken of the VNC and then I’ll switch over to my computer.
So if we go in to sysmon, you can see that we have this wonderful little alert that basically says network connection was detected.
This is the executable, this is the user account, the protocol was TCP. Great. So we’re able to find our malware relatively easily. Let me see if I can open up.
No, no, no, wrong one. Let’s see if we can open up another Event Viewer. I don’t know if I can do that. I want to keep that one up. Perfect.
If we go to the Windows event logs and we go to the security event logs and we start going through these various security logs, we don’t get any of that level of detail. What is the executable?
What is the network connection that was actually being made? Who started this executable? Was there a parent process? It’s just not as nice.
We can even go to the system and application logs and we really just aren’t getting the whole story. With many events on Windows you actually have to go through and look at multiple different event logs to be able to piece together exactly what happened.
Now, with Sysmon, and we discussed this a little bit in last week’s presentation, this gives us the direct details. We can see that an event ID of 3, if I scroll up here, is a network connection.
An event ID of 1 means that there was a process that was created. 11 means that there was a file that was created. So we get these very, very clean alerts out of Sysmon, and they’ve recently added in the ability to do DNS logging in Sysmon.
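As a rough illustration of the event IDs just mentioned, a lookup table like the following could label Sysmon events during triage. This is a minimal sketch rather than anything shown in the webcast: the names `SYSMON_EVENT_IDS` and `describe_event` are hypothetical, and event ID 22 is included on the assumption that the DNS logging referred to is Sysmon's DNS query event.

```python
# Sysmon event IDs named in the talk, plus the DNS query event.
# Only a small subset of Sysmon's event types is listed here.
SYSMON_EVENT_IDS = {
    1: "Process creation",
    3: "Network connection",
    11: "File create",
    22: "DNS query",
}


def describe_event(event_id: int) -> str:
    """Return a human-readable label for a Sysmon event ID."""
    return SYSMON_EVENT_IDS.get(event_id, f"Unmapped Sysmon event ID {event_id}")


if __name__ == "__main__":
    for event_id in (1, 3, 11, 22, 999):
        print(event_id, describe_event(event_id))
```

A table like this is only a convenience for reading raw events; the Sysmon documentation remains the reference for the full list.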
And all of this is great. But if we go back to the problem statement here, let me go back to the event viewer and open up the security events.
We’re looking at Windows event logs to actually parse what’s going on. You will get some information directly here. It’ll be in the Subject tag, and then it’ll be the privileges associated with it.
And then you have additional details in Details. And many times this is not something that’s easily parsable. Now, that’s usually where a large amount of money goes to people like ArcSight and Splunk to actually parse this properly.
But it is not the easiest thing in the world to do. So what we’re going to do is talk about how we can get the better logs, that is the Sysmon logs, and to be honest, even the Windows event logs, into an ELK stack where we can do better analysis of them on our system.
I’m going to set that to the side for just a little bit. We’re going to go back to our webcast slides here. So sysmon, as I mentioned. Oh yeah. Did somebody have a question just real quick?
CJ
Well, when you have the small font on screen, if you could magnify a bit, you got a request.
John
I thought I was magnifying. That magnifying didn’t come through.
CJ
One of them was pretty small.
John
I don’t know. Just take a look. Well, we’ll get into the fine print here in just a little bit. I just wanted to compare and contrast the difference between the two types of logging. One doesn’t give us a lot of context from a security perspective, and the other one gives us a lot of security context.
So don’t worry too much about it. We’re going to drill in with a lot more detail. And also in the last webcast, we demonstrated how you can set up Sysmon in like five minutes and get it running. So now, ELK. Now I’ll say something different.
There’s a lot of people that have been working with Elasticsearch, Logstash, and Kibana for a long, long, long time. And that is great. This webcast is probably not for a lot of the people that have been using ELK for a long time, but maybe Winlogbeat and Sysmon will provide value.
But I really want to focus on, for the people that are new, why this is something that’s fundamentally different from an open source perspective, and why it’s something that they should be trying to at least test in their environment to figure out what’s going on.
So we’ll be talking about a number of them. SOF-ELK is the one with the cool elk with the night vision goggles. And HELK is from the fantastic team at SpecterOps.
They’ve been working on an alpha product, and that’s the one that I’m going to be demonstrating tonight because I really like where that project’s going. It’s still very much alpha, but it’s super cool. So anytime you hear someone talking about throwing their logs into ELK, they’re actually throwing them into three separate things.
You’ve got Elasticsearch, Logstash, and Kibana. And these different components can be brought together to provide a really wonderful logging solution that gives you scalability, extensibility, visualizations, really cool search capability, and even alert capability.
So if you look at the HELK picture that has the really cool elk, that’s like the Robin Hood elk, you can have Winlogbeat, which is from Elasticsearch, that actually pulls event logs and allows you to dump them into Logstash, which allows you to pull them into Elasticsearch.
Then you can have visualizations with Kibana and you can have alerting with ElastAlert. We’ll also talk a little bit about Sigma. Now there’s a bunch of other things in here, like other data provisioning tools and data analysis tools like Hadoop and then also Spark.
We’re getting into some artificial intelligence and some baselining in the environment that we might talk about later. But needless to say, the team at SpecterOps has done a fantastic job just getting the base level of an ELK stack that is very quick to set up. We’re talking about a matter of moments to get set up and running in an environment.
In fact, it probably will take you less than ten minutes to get HELK set up from start to finish. The longest thing would probably be the install process using just the standard repositories.
So let’s go through and take apart each one of these, and let’s set up a problem statement and why this particular component is good. So why ELK? Let’s talk about data for a couple minutes.
This is actually critical because if you’re looking at a traditional SIEM and you’re trying to utilize a traditional relational database, you’re going to run into some issues. So let me explain.
So with a traditional database you have columns and you have rows, and everyone’s very, very familiar with that kind of Excel spreadsheet type of analysis, right?
So you receive structured data, you put it into these columns and well defined rows and then you have an effective date, you have the primary keys and you’re doing the correlation of that database.
That is amazing if you’re working with highly structured data that has very clean inputs and then very specific outputs that you’re trying to work with. When you’re dealing with logs, though, you’re dealing with something closer to documents.
So instead of thinking of highly structured data, I want you to think more of like Google. Google is crawling the entire Internet constantly and it’s basically indexing a whole bunch of quote unquote documents, HTML files, the actual text associated with that HTML.
And then it allows you to correlate and then do some analysis and some processing on the backend as far as the relevance of these different web pages to your specific searches. But it’s a lot of different types of data, from Excel spreadsheets to PDF’s to web pages and so on.
And Google is able to correlate all of that basically by using the power of unlimited computing power. More money than God. So how does that problem of documents actually apply to event logs?
It actually applies quite a bit when you’re looking at event logs. Let’s just use, for example, Windows event logs. So if we look at Windows event logs (I have to stop closing this), we’re kind of dealing with a document, and this data can get dumped into the Elasticsearch database, and it can basically collect this entire record of the data that’s been provided to it.
And this becomes something that is a searchable document within that Elasticsearch instance. Well, how this actually applies to the rest of logging is, imagine you’re now pulling Sysmon, imagine you’re now pulling NetFlow data.
Let’s imagine that you’re pulling Windows events and you’re pulling syslog events and you have all these different types of data sources with different formats that are now being pulled together into a centralized repository.
If you were working on a standard structured database, the performance improvements that you would get on that actual database probably start falling apart because you would actually be putting a tremendous amount of data into specific values within records and columns within that database itself.
So the ability to actually go through and index that database probably would start falling apart, especially whenever you’re trying to correlate and fuse that data across multiple different data points.
So when you’re dealing with logs, you have a lot of different formats coming in and you want to be able to index and search across all of those different documents. And that’s really where Elasticsearch shines. So whenever you hear people saying that Elasticsearch is just better than a database, they’re full of crap.
They honestly don’t know what they’re talking about. What they mean is that Elasticsearch is very effective for specific types of data. It’s about using the right tool for the right job. So, for event logging, you have the ability to pull this data in and be able to index it as a variety of different records with different values within those records or those specific documents.
And then you can easily search it and then correlate it across a whole bunch of other disparate log sources and do so relatively quickly. And that’s the part that’s super important and wicked cool when you’re talking about Elasticsearch.
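To make the "documents, not rows" idea above a little more concrete, here is a toy inverted index in plain Python. It is only a sketch of the general concept and is not how Elasticsearch is actually implemented; the sample events, their field names, and the helper names (`tokenize`, `build_index`, `search`) are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical events from three different sources. Each "document" has its
# own shape; no shared table schema is required.
DOCS = [
    {"source": "sysmon", "event_id": 3, "image": "C:\\Temp\\evil.exe", "dest_ip": "10.0.0.5"},
    {"source": "syslog", "message": "Accepted password for root from 10.0.0.5"},
    {"source": "netflow", "src_ip": "192.168.1.20", "dst_ip": "10.0.0.5", "bytes": 4096},
]


def tokenize(value):
    """Split a field value into lowercase tokens, keeping dots so IPs stay whole."""
    return re.findall(r"[a-z0-9.]+", str(value).lower())


def build_index(docs):
    """Map each token to the set of document positions that contain it."""
    index = defaultdict(set)
    for doc_id, doc in enumerate(docs):
        for value in doc.values():
            for token in tokenize(value):
                index[token].add(doc_id)
    return index


def search(index, term):
    """Return the sorted positions of documents containing the term."""
    return sorted(index.get(term.lower(), set()))


INDEX = build_index(DOCS)

if __name__ == "__main__":
    # One lookup finds the same IP across Sysmon, syslog, and NetFlow records.
    print(search(INDEX, "10.0.0.5"))
```

The point of the sketch is that a single term lookup crosses all three log formats without any of them having been forced into common columns first.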
So don’t think that Elasticsearch is just better than anything for data. It’s not. We’ve tipped over Elasticsearch instances all the time. In fact, when we first started with RITA, RITA was originally built on an ELK stack.
We were ingesting log data, that is, network traffic. And then we were doing our processing, and Elasticsearch was falling over. Now, I know there’s people that are like, well, you can do clustering and you can scale it.
You get into problems with sharding whenever you start setting up clusters. Especially whenever you’re doing select star queries, it gets ugly.
CJ
All right, I got your questions, John.
John
All right, so we got some questions. Go ahead and throw some over.
CJ
Someone wants to ask about getting a copy of the event logs that you’re using, I guess to try to run through as a sample?
John
Yeah, I could absolutely do that. I could export those and share them. Sure. Why not? Because that’s not all that impressive.
CJ
Yeah. Is Sysmon safe to use on both servers and workstations?
John
Yes, it’s by Microsoft, for Microsoft. Mark Russinovich wrote it, and all hail Mark Russinovich, and don’t ever cross him. You definitely want to do some filtering. Once again, I’ve run the SwiftOnSecurity filtering for the configuration of Sysmon, and it cleans it up.
However, I’ve learned recently that if you’re using the SwiftOnSecurity Sysmon configs, you also want to add in some rules for LSASS, the Local Security Authority Subsystem Service, events to detect some more attacks.
But that’s a whole nother conversation. But a very good question.
CJ
Only use it on lightly loaded servers. Is there a heavy load with it?
John
I don’t believe so. I’ve seen some of our customers run it on absolutely everything across their entire environment. I would say I’m seeing it probably in about 15% to 20% of the people I have IANS calls with, and my SANS students as well.
CJ
Excellent. All right, that’s it.
John
All right, cool. Now let’s talk about logs, because I mentioned you’ve got Elasticsearch, which is great for the documents that come in, and then you can do that analysis, that indexing and that searching of those documents.
But there’s a whole other issue. This particular issue is all the different types of logs. So Sysmon is probably, or excuse me, syslog is probably the de facto standard.
Everyone should do Syslog because that’s what we came up with back in the 1860s or whatever. However, a number of vendors basically looked at syslog and they’re like, yes, this is actually a defined standard.
It makes sense, we should be using it. Screw that. Let’s come up with our own. And I’m blaming Microsoft for this. Microsoft came up with their own standard for generating their event logs and then trying to get vendor logs.
We’ll talk about Windows Event Forwarding here in a couple of seconds. They don’t speak syslog. Then you get NetFlow. Cisco sits down and says, well, yes, that whole syslog thing is kind of cool, but we want to come up with NetFlow.
So we’re going to come up with NetFlow and our own little proprietary standard for how it is that we’re going to process and give data about network connections outbound. So you have all of these different types of logs.
They could be pushed in a variety of different formats. So enter Logstash. Logstash allows you to listen for those logs coming in. And then Logstash can receive those logs: database logs, telemetry logs, web logs, documents, whatever.
And then it can take those logs and drop them into something like Elasticsearch. And then Elasticsearch will actually archive and index them properly.
So Logstash is the super cool glue that ties your devices and your systems together and allows you to dump it all into your Elasticsearch instance, where it can then be easily indexed and analyzed.
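The "glue" role described above can be sketched in a few lines of Python: accept lines in different formats and emit documents with a common shape. This is a conceptual illustration only and is not Logstash configuration or Logstash code. The simplified syslog pattern, the output field names, and the function names are assumptions made for the example; the one borrowed fact is that syslog severity is the priority value modulo 8.

```python
import json
import re

# Simplified, hypothetical syslog shape: "<PRI>host app: message".
# Real syslog lines also carry timestamps and other fields.
SYSLOG_RE = re.compile(r"^<(?P<pri>\d+)>(?P<host>\S+) (?P<app>[^:]+): (?P<message>.*)$")


def parse_syslog(line):
    """Turn a simplified syslog line into a document, tagging parse failures."""
    match = SYSLOG_RE.match(line)
    if match is None:
        return {"source": "syslog", "message": line, "tags": ["parse_failure"]}
    fields = match.groupdict()
    return {
        "source": "syslog",
        "host": fields["host"],
        "app": fields["app"],
        "message": fields["message"],
        "severity": int(fields["pri"]) % 8,  # syslog severity = PRI mod 8
    }


def parse_json_event(line):
    """Turn a JSON event into a document, filling in defaults for missing keys."""
    doc = {"source": "json", "host": "unknown", "message": ""}
    doc.update(json.loads(line))
    return doc


def normalize(line):
    """Route a raw line to a parser based on its shape."""
    line = line.strip()
    if line.startswith("{"):
        return parse_json_event(line)
    return parse_syslog(line)


if __name__ == "__main__":
    print(normalize("<34>web01 sshd: Failed password for root"))
    print(normalize('{"host": "ws01", "event_id": 1, "message": "Process Create"}'))
```

Once every source produces documents with at least `source`, `host`, and `message`, they can all be indexed and searched together, which is the value the speaker attributes to having an ingest layer in front of the search engine.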
I can’t stress just how cool it actually is that we have something like Logstash, because logging in and of itself was a nightmare for a long time. If you wanted to have multiple different ingesters, you would usually have to buy a tool like Splunk or ArcSight to be able to receive a lot of those different data feeds coming in.
Now, Kibana. Kibana is the visualization engine. When most people are taking screenshots of their ELK instance, it is almost always their ELK instance being shown through the eyes of Kibana.
And to be honest, that’s what I’m going to be showing here in a little bit: the Kibana interface, a bunch of really crazy cool dashboards. I’m showing off the awesome SOF-ELK dashboard from the SANS DFIR team. Phil Hagen and a bunch of people work with that.
Justin, the author of SEC555 at SANS, also works very heavily in SOF-ELK as well. But the ability to create and save dashboards and be able to have those searches indexed and then be able to pull them up is really just something to behold.
When you’re talking about indexing, you can talk about indexing in two different ways. You’ll hear indexing used in conjunction with Elasticsearch for creating indexes of the data. But then you can also create indexes for Kibana as well, for the visualization.
Those are two different things. But as I said, our very own Mick Douglas from the SANS Institute will be coming in to a webcast and talking about some really cool kung fu where you can get some really awesome data, especially on the visualization side, utilizing Kibana.
So whenever we discuss an ELK stack, that’s what an ELK stack is. That’s how an ELK stack actually works. And that’s also why you need to be looking at ELK stacks. It’s weird, I can’t speak to this for myself, but a number of people get to the end of what they can do with something like Splunk, and then they move to ELK to be able to handle more data and get the types of visualizations they’re looking for.
So if you’re ever getting into a question of how does this scale, it scales really well, and there’s a lot of people that scale it magically. Now, I’ve made this sound like sunshine and roses.
I need to drop a little bit of reality on it. You’re looking at things like SOF-ELK, you’re looking at things like HELK, and Amazon has their own ELK instance that they’re creating as well.
The problem is, these all shine in different ways, but then they start to fall apart in others, and they kind of complement each other. So this is usually what I recommend, especially if you’re getting started with playing with ELK, and I’ll give you all the steps that you need to get set up with ELK.
You’re going to have to dig and you’re going to have to play around with it a lot to get specifically what you want. Even though a lot of companies may start with something like HELK or SOF-ELK, they usually end up kind of creating their own ELK instance with their own indexes, but they’re stealing information, or at least configs and ideas, from these other platforms.
It is also not necessarily for the faint of heart just to jump in and start doing it on your own. It really, really helps to have those water wings to kind of get you started. And I really think that SOF-ELK and HELK do an amazing job of that.
I love this book, but by elk. All right, so all of this can be only as good as the data that you’re actually feeding it, right? And this is really kind of the continuation of the previous webcast where we discussed Sysmon, and the ability to actually integrate that Sysmon data into something that’s more enterprise grade and actually push that data into something like, I don’t know, Elasticsearch.
And we’ll talk about the tools that are available to make this very, very, very easy for you to stand up. To be honest, once you actually move from that traditional Windows event logging into Sysmon logging, your ability to detect a vast majority of attacks is going to increase exponentially.
At least you’ll get a good understanding of what it’s doing. This also sets the stage, to be honest, and this is something that HELK is working on, for trying to create dashboards and feeds specifically around detecting a variety of different attacks, like the MITRE ATT&CK technique matrix, which ties it again to the beginning of this presentation. That is key, because if we’re just collecting data, that’s one thing, but if we’re actually generating powerful alerts on that data based on actual threat intelligence, now we’re actually getting something that’s useful.
Now we’re not just collecting and ingesting a threat intelligence feed, we’re actually leveraging it against something that can generate alerts in our environment. And that’s really, really cool. And that’s ultimately what we should be getting.
So we want to get the Windows event logs that we want into our ELK instance, and we want to get them to Elasticsearch, and we can do that very quickly with Winlogbeat. So today, for the presentation that I’m going to be running through, I wanted to walk through the steps so that you could get set up and running with a full ELK stack and be able to start ingesting those logs, and be able to do it in such a way that it didn’t take you five days or a day to do it.
I want you to be able to pull it off and get it done in probably less than an hour. I think you could get it up and running in less than an hour. Now, HELK is still alpha, and for that I say don’t use this for production ever.
And that doesn’t mean I’m digging on the SpecterOps team. They very, very clearly say up front that this is still alpha. There’s a lot of stuff that they’re currently working on. You’re going to have bugs that you’re going to deal with.
And there’s a whole bunch of cool features that I’m sure work for their team, but maybe those features don’t work for the rest of us as it stands right now. And that’s, once again, not a dig on this team. The fact that they’re releasing this tool, creating this tool, and then allowing this tool to be shared with the public as they create it is amazing.
And to be honest, right out of the gate, it’s actually pretty darn good. So we have data normalization. We’ll talk about this as far as ingesting log files here in just a couple of seconds. As I said, I would not recommend this for production.
So let’s go through and let’s set this up. So I’m actually going to go to my HELK server and I’m going to show you the specific commands that I used to get it up and running.
All right, so this is the kibana interface. And you can see that I have logs that are coming in. I’ll come back to that here in just a couple of seconds. but we have some amazing things.
one is the install process is just stupid, stupid easy. let me see here. I don’t want to show that yet. So the only thing that you need to do to get it up and running is you just simply have to do a git, clone and pull down helc.
And when you do that, let me zoom in for you because I know some of you, this is going kind of hard to see, should be able to see that. And whenever you clone the help repository, it’s going to pull down everything that you need, to actually make help work.
the actual install file is going to be here in the docker directory and there’s this wonderful little script called helc install.
It’s up here. Install. Where is it? Did I pass it?
There it is. helk_install.sh. That’s it. Now a couple of quick notes. One, ideally whenever you’re standing up your HELK instance, you’re going to want to stand it up with a static IP address because it’s going to ask what its IP address is.
It’s going to stay at that IP address and you’re going to have hosts sending data to that IP address. You don’t want that to be a DHCP lease where it’s basically changing its IP every so many hours or days.
That’s one of the great ways that things can break in such a way that you wouldn’t know. Also, I want you to pay special close attention to the install requirements of the system. If you actually go to the page, they tell you to run an Ubuntu 16.04 system; it’s strongly recommended.
You need to have the right amount of, in fact, we could go there, but you need to have so much memory, I think it’s like eight gigs. And then you need to have so many cores. I believe that they recommend four CPU cores.
So give it the right memory, give it the right CPU cores, give it a default Ubuntu installation, and then you’re off to the races. And then it just works. It’s going to ask you some questions. It’s going to say, what do you want to install by default?
It has options from the smallest set of utilities all the way up to everything. Just stick with the defaults. It’s going to ask you for a default password for your Elasticsearch instance. Kibana has some defaults in there.
You can leave the defaults or change them, and then it just installs it and it just works right out of the gate. It’s amazing for me to have anything where I can run it and then automatically just get what I need out of it.
Let me show you what it looks like whenever you do have what you need. This is it running. I’m going to go through what’s actually going on here. But it actually worked. I’m not joking. It probably took me about 30 minutes, especially since I didn’t follow the instructions as far as the base system requirements right out of the gate.
But it only took me about 30 minutes from the initial install of HELK to the point where I actually had data being dumped into HELK itself, which is really cool. You just don’t see that very often.
Now, on the Windows side, you do have some additional configurations that are required to make it work. So if I go to my workstation here, the first thing I did was I downloaded Winlogonbeat.
Winlogbeat, sorry. And that’s just a nice little zip file that you can pull down that has everything you need to get Winlogbeat up and running. I extracted the zip file into my tools directory and I had to modify the winlogbeat.yml file.
Let me show you what changes I made to that file. Very few, actually. So I go to Notepad and I look at the winlogbeat.yml file and I zoom in on this and I scroll up.
At the top you have the specific event logs that are going to be forwarded off of your Windows computer system into the HELK instance that we stood up. I can pull the application logs, I can pull the system logs, and I can pull the security logs.
The big one that we’re looking at today is the Microsoft-Windows-Sysmon/Operational logs. Those are the logs that we’re going to be looking at more than anything. But I wanted to show you that you can pull the application, system, and security logs off of a Windows system very, very quickly and super easily.
By the way, I didn’t make any of these changes. This is all the default setting out of the box for using Winlogbeat, getting the logs that I need out of this computer system.
The only really big change that I had to make is at the bottom, where I scroll down to Logstash. Actually, hold on, there were some Elasticsearch configs. It actually had output to Elasticsearch and it said localhost.
I had to comment those out at the top. And then I just uncommented the output.logstash line right here. Uncommented that, and then I put in the host of my HELK system right here.
So I just basically pulled that out and I say I want my logs to be sent to 172.16.12.14 on port 5044.
That’s it. That’s the YAML file. That was literally the only configuration that I needed to do on Winlogbeat to get the logs off of this Windows system and actually get them dumped into my HELK instance.
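As a rough illustration of the edit described above, here is a minimal Python sketch that renders a winlogbeat.yml with the four event logs and the Logstash output. The log names, host, and port come from the talk; the function name, the commented-out localhost:9200 line, and the overall layout are assumptions for illustration, not the exact file shown on screen.

```python
# Sketch: render a minimal winlogbeat.yml like the one described in the talk.
# EVENT_LOGS and the host/port are from the talk; the helper name and the
# commented-out Elasticsearch lines are illustrative assumptions.

EVENT_LOGS = [
    "Application",
    "System",
    "Security",
    "Microsoft-Windows-Sysmon/Operational",
]


def render_winlogbeat_config(logstash_host: str, port: int = 5044) -> str:
    """Return YAML text that ships the listed event logs to Logstash."""
    lines = ["winlogbeat.event_logs:"]
    for name in EVENT_LOGS:
        lines.append(f"  - name: {name}")
    lines.append("")
    lines.append("# Default Elasticsearch output, left commented out.")
    lines.append("#output.elasticsearch:")
    lines.append('#  hosts: ["localhost:9200"]')
    lines.append("")
    lines.append("output.logstash:")
    lines.append(f'  hosts: ["{logstash_host}:{port}"]')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_winlogbeat_config("172.16.12.14"))
```

Only one output section should be active at a time, which is why the Elasticsearch block stays commented out when the Logstash block is enabled.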
Now we had to go through the process of actually starting that. Now, some of the commands I no longer have in my history. For Winlogbeat, you have to run the actual PS1 file to install the service for Winlogbeat.
And then you set the service name to Winlogbeat, set the startup type to automatic, and then start the service. And these are actually some instructions from Silent Break Security. They had this blog, it was like Sysmon, ELK, and, I don’t know, Winlogbeat,
oh my. A fantastic blog. Some of the data on this blog post is a little bit out of date. I don’t fault them for that at all because there’s a bunch of things that are changing. But I actually got these commands from them to actually enable Winlogbeat to be running.
You have the configuration and you just edit the YAML file and you start it. That’s it. You would need to have Sysmon enabled and running. Of course, I have that command up above.
That’s pretty easy. One final thing, when you’re looking at these log files and you’re dealing with YAML, spaces and tabs matter and I don’t want to get into this huge fight of spaces versus tabs, tabs versus spaces, because that’s just weird.
But if you go and you edit this, you really don’t want to be messing with it and taking hosts all the way to the left or changing the spacing of things, because it’s expecting things with the proper spacing and you will actually break it.
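One way to catch the spacing problem before starting the service is a quick check for tab characters in the indentation, since YAML does not allow tabs for indentation. The sketch below is a hypothetical helper, not part of Winlogbeat, and it only catches tabs, not misaligned spaces.

```python
from typing import List


def find_tab_indented_lines(yaml_text: str) -> List[int]:
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        # Everything before the first non-whitespace character is indentation.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad.append(number)
    return bad


if __name__ == "__main__":
    good = 'output.logstash:\n  hosts: ["172.16.12.14:5044"]\n'
    bad = 'output.logstash:\n\thosts: ["172.16.12.14:5044"]\n'
    print(find_tab_indented_lines(good))  # no offending lines
    print(find_tab_indented_lines(bad))   # line 2 is indented with a tab
```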
You can actually start Winlogbeat with the -c parameter, give it the config, and run a test to see whether or not it would actually work properly. The other thing that I really like about Winlogbeat is this.
I’m huge on getting telemetry and getting information about what is actually happening on my computer systems. Here is the log from Winlogbeat.
Let me kill this real quick so you can actually see what it looks like. If you run winlogbeat.exe with -e, you’ll be able to see it actually sending the data directly into our ELK stack.
This is really key for me because I had a problem where my IP address shifted. Remember I told you about DHCP? Yeah, I fell prey to that, and it was really cool because I was able to quickly see from this interface that it was unable to reach the actual system that was reading in my data coming in from Winlogbeat.
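The reachability problem described above can also be checked independently of Winlogbeat with a plain TCP connection attempt. This is a minimal sketch using only the Python standard library; the host and port in the usage line are the ones mentioned in the talk, and the function name is made up for illustration.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Address of the Logstash listener from the talk; adjust for your setup.
    print(can_reach("172.16.12.14", 5044))
```

A successful connection only shows that something is listening on that port; it does not prove that Logstash is accepting Beats traffic.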
So it’s very quick and easy to actually troubleshoot. That only took a couple of minutes to find out that I had a problem. So you can see the data that’s actually being sent. It’s not giving you the raw data.
It’s basically saying how many events were written up to the remote system. So very, very, very cool. Super easy to set up. I think I have the instructions for starting it right here.
So here’s what you do to actually run it. You can just do winlogbeat.exe -c, give it the config file, and then it goes and you’re ready to rock and roll.
So that’s the overall setup. I have Sysmon running. I have my Winlogbeat running. It’s sending my data to the HELK instance. And like I said, honestly folks, if you decide to set this up, it’ll only take you about 30 minutes tops to get this completely set up and ready to rock and roll.
And that’s just huge for me. So now I have these logs on my Windows system, and if I look at the logs associated with running malware, this is something I did in the previous webcast where I actually fired up some malware.
I’ve got a full session here, set up. I’ve got a meterpreter session set up on the system. Interact with that session, oops.
And I’m in. So we’ve got our Meterpreter session. No real big black magic with this at all. It’s just a standard ADHD instance running Metasploit and standing up a malicious website.
But I want to be able to go through and track what’s actually going on with it because that’s more interesting to me as a defender. As soon as I ran that malware, immediately a series of event logs were generated by Sysmon.
I can see those event logs, I can see the image, I can see the protocol, I can see the source IP address, destination IP address. That’s great. If I go up to the top and work through the Sysmon logs associated with this, I can get the process image name and the command line associated with the execution of this as well.
Now, the trick, as I mentioned, is we want to be able to see the same data and we want to see it in our HELK stack so we can actually get the full accounting of what happened.
Once again, I’m using SwiftOnSecurity’s configuration for Sysmon because it’ll reduce a lot of the noise that’ll be generated on a computer system. So you aren’t looking at, like, it’s almost like stack trace level information.
You’re just pulling down the specific executables, the network connections, the processes that are firing on this computer. Let’s go back. Yeah, go ahead. Quick question: it looks like security or Sysmon logs were the only ones being forwarded.
Actually, is that correct? That was incorrect. So if you actually look at the logs, the ones that are being forwarded, if we go back to the config, let me go up to the top.
You can see that we have the application logs, the system logs, the security logs, and the Sysmon logs all being sent to our HELK instance.
Does that make sense? And does that answer the question? Yes. And then another question. Is there any alerting capability yet with HELK or would you need another tool? Yes, Elasticsearch has the capability of generating alerts as well.
So you can do that. Once again, I wouldn’t recommend doing that with HELK, folks. HELK is still very much alpha. You’re going to have to stand that up on your own, but you’ll see some of the things that they’re working on in the HELK project and you’ll see why I picked it for today.
All right, any other questions for you?
CJ
Yep.
John
Yeah.
CJ
Good question. On where you can filter events. Is there any point in there where you’re filtering or is it just when you query?
John
So you’re going to have to actually configure your event filtering before it actually gets here, because the only thing that Winlogbeat is doing is pulling the logs directly out and then sending them. So you can do the filtering directly on what types of event IDs you’re going to pull. For the application, system, and security event logs, you would basically do that filtering through group policy.
The Sysmon filtering, once again: if we actually go through and we look at the configuration for that, there’s a lot of different things that are being filtered out by the SwiftOnSecurity configuration.
I thought I had that XML file in here somewhere. Right here, let me open this up. I believe this is SwiftOnSecurity’s?
Yep. So this is the actual configuration for Sysmon that I’m using, and there are a lot of really cool notes in here. But once you get down into it, it has specifically the types of things to filter. Like, I don’t want to see every single time the audio service starts up, so we can filter all of those out.
You really don’t respect just how much SwiftOnSecurity did with filtering out Windows executables and processes, and creating just this wonderful list to filter out the white noise, until you actually start going through the configuration.
But this would be the SwiftOnSecurity configuration for Sysmon.
CJ
Is there a link anywhere in the presentation to show them where to go get that? Or should they just Google?
John
Yeah, I can just bring it up real quick. If somebody could share that, we can respond. It’s basically the GitHub repository for SwiftOnSecurity. So let me do that.
CJ
Excellent.
John
It’s right here. Now in the previous webcast I had mentioned that you want to download this as a zip file and then pull it. Don’t just go and copy the text out, because you’ve got some carriage return line feeds that’ll create issues.
So download the zip file, extract it, and then use that instead. So in the new alpha version, the z-AlphaVersion, they’re adding in, he’s adding in a bunch of cool things for DNS logging to be enabled as well.
So any other questions?
CJ
Let’s see if I can ask this properly. There are a couple of questions about Winlogbeat. Does it read from the forwarded event logs?
John
Nope, that’s different. It’s actually pulling the logs directly off of the system. We’ll be talking about WEF, Windows Event Forwarding, in a later webcast, probably the one that Mick’s going to come on.
And the reason why is with Windows Event Forwarding you take all of your Windows event logs and have them forwarded to a collector. The collector can receive all of those, and then you can actually take actions based on event logs as they come in.
So there’s a bunch of cool things that you can do. Not quite a SIEM per se, but it is absolutely something that is very, very valuable in a Windows environment to have that type of Windows Event Forwarding.
And you’ll see I have a slide a bit later where I’ll talk about Windows Event Forwarding as something that we’re going to talk about in later webcasts. Any other questions?
CJ
I think you got a few of them in one go there. So I’m doing my best to keep up with the questions.
John
Awesome. Now we’re looking at the Kibana dashboard, and you saw earlier that I had this msf.exe that was being fired up, and you can now see that that Sysmon log is now showing up in my ELK instance.
So I can actually see that. I can see the parent process was Explorer, which means the user most likely clicked on it, and you can see the hash associated with it and the full file path, where it was actually put on the file system.
And if you go to the little caret and you expand it, it’s going to give you a whole bunch of data broken out and sorted. Now this is actually, believe it or not, magic, because this means we can actually sort and we can search based on this data.
So if I wanted to like let’s go through and let’s pick one. Let’s actually pick the executable. So I could do process exe, right?
So with that, if we go back up here you can see, like, parent process name, parent process ID, process executable. You can actually now do a search across your entire ELK instance by just looking for that.
So let’s say I was working an incident and I wanted to see if there were any other systems that were executing this exact same executable. I can actually put it up here in the search. I can do process exe, colon, which is the equivalent of equals, and then I can give it that particular executable. Then I select update, and now it’s going to go through, generate the query, and pull up that exact same executable again on the system.
So, oh, I got the carriage return line feeds. No, not carriage return; I didn’t escape out my slashes properly. Let’s go do process name instead because that’s a little bit faster than me trying to do escape characters.
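The failed search above comes down to backslashes and colons having special meaning in Lucene-style query strings. Here is a minimal sketch of an escaping helper, assuming the classic Lucene set of special characters; the field name process_path and the example path are hypothetical and may not match the names in any given HELK build.

```python
# Characters that the classic Lucene query syntax treats as special.
LUCENE_SPECIAL = set('+-&|!(){}[]^"~*?:\\/')


def escape_lucene(value: str) -> str:
    """Backslash-escape Lucene special characters in a literal value."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in value)


if __name__ == "__main__":
    path = r"C:\Users\bob\msf.exe"  # hypothetical path for illustration
    # "process_path" is an assumed field name, not confirmed by the talk.
    print("process_path:" + escape_lucene(path))
```

Spaces are not escaped here, so a path that contains spaces would still need quoting or further handling.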
Let’s go process name and we’ll do this. I know that there are probably some people that were really cringing when I ran that. So now here we have anything with a process name of msf.exe.
Now we can see all of the individual events associated with that specific executable. So I want you to imagine this from an incident response scenario. You have an executable that is suspicious and you want to see is there any other system in the environment that’s actually running this exact same executable?
Well yeah, you can actually now quickly do that search and find every single host name that actually fired that particular executable. You can also get the timeline associated with when those executables themselves were fired.
And you can see that mine fired up right about noon. And then I also ran it earlier this morning right before everything blew up in my face and I had to recreate it again. You can see that it was actually fired off twice.
You’ve got that type of visualization of what’s going on. But all these different tags, and this is important, all the different tags of information that we get in Sysmon, it’s basically stuff that we can query.
If I look at the network rules, just imagine that you could do a query like this: you got an alert from something like RITA, and it says there’s a connection to a destination IP address. Well, now you can do a search very quickly in Elasticsearch for any other system connecting to that particular IP address and, if so, when.
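The "which other systems talked to this IP, and when" question maps naturally onto an Elasticsearch query with a filter and a per-host aggregation. The sketch below only builds the request body as a Python dictionary; the field names dst_ip_addr and host_name are assumptions about how the data is normalized, and the IP in the usage line is a documentation address, not one from the demo.

```python
import json


def build_connection_query(dest_ip: str,
                           ip_field: str = "dst_ip_addr",
                           host_field: str = "host_name") -> dict:
    """Build a search body: hosts that connected to dest_ip, with first/last seen."""
    return {
        "size": 0,  # only the aggregation results are needed
        "query": {"bool": {"filter": [{"term": {ip_field: dest_ip}}]}},
        "aggs": {
            "by_host": {
                "terms": {"field": host_field},
                "aggs": {
                    "first_seen": {"min": {"field": "@timestamp"}},
                    "last_seen": {"max": {"field": "@timestamp"}},
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_connection_query("203.0.113.10"), indent=2))
```

The printed body could be sent to a search endpoint or pasted into Kibana's Dev Tools console once the field names are adjusted to the actual index mapping.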
So this is truly what we wanted out of a SIEM, to have that ability to actually dig deeper into what’s going on. Now, I do know if there’s anybody here from SpecterOps that was associated with the HELK project, they’re probably like, wow, that’s really, really basic.
There’s so much more that HELK is trying to do, and they would be absolutely correct. So when you’re looking at what they’re working on in the HELK project, there’s also a bunch of different dashboards that they’re working on.
So a dashboard is taking those different searches and then creating dashboards for a variety of different things. Like, what are the different event IDs? How many failed login connections do you have in an environment?
Is there anybody hammering in the construction zone below me? All of these things come to fruition when we’re looking at the different dashboards that exist. So we have the data being loaded into ELK, we have dashboards that they’re creating, and they’re actually creating quite a few.
Now, some of these dashboards didn’t work for me and I just chalked that up to the fact that this is alpha. So one of the things that they’re trying to do is basically cross-reference a bunch of attacks and actually have the MITRE ATT&CK technique matrix inside of this ELK instance.
And then you can automatically do lookups and alerts based on the MITRE ATT&CK framework. So that’s awesome, right? So we can dive in and we can just find anything associated with any of those, but at the moment, everything I fire, whenever I fire it off, comes back with nothing at all.
That could be because the indexing isn’t set up properly. But this is just kind of showing where this overall project is going. Now, the other thing that I think is important with what this gives us, right, if we actually are working with an ELK dashboard and we have these logs being generated and we have these logs being dumped in, is we also have the ability of starting to have conversations about things like Sigma, which is not opening because my system is clearly frozen now. That’s great.
But Sigma, if we just jump out, come on, come on back, I guess it’s not opening. But if we look at the Sigma project, I want to show you kind of why I’m starting to get super excited by all of this.
So if we go to Sigma, this is a project that started up a little while ago. And what they’re trying to do is establish a generic signature format for SIEMs so that you can take these different event logs, these different detections, and then convert them into Elasticsearch queries or Splunk searches.
So you can have this unified format of taking something like the MITRE ATT&CK technique matrix and then loading that into Elasticsearch. And as they say right here, Sigma is for log files what Snort is for network traffic and YARA is for files.
Once we get to that point, we can now ingest our data and we can process it and then we can kick it out into a variety of different queries and searches. Now we can start generating rules.
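To make the idea of a generic rule being converted into a backend query concrete, here is a deliberately tiny Python sketch. It is not the Sigma project's converter and supports only a single flat selection; the rule content, the field name process_name, and the second process name are hypothetical.

```python
# Toy illustration of the Sigma idea: one rule, converted to a Lucene-style
# query string. This is NOT the real Sigma converter; it handles only a flat
# "selection" and does not escape special characters in values.

RULE = {
    "title": "Example: suspicious process name (hypothetical rule)",
    "logsource": {"product": "windows", "service": "sysmon"},
    "detection": {
        "selection": {"EventID": 1, "process_name": ["msf.exe", "evil.exe"]},
        "condition": "selection",
    },
}


def selection_to_query(selection: dict) -> str:
    """AND together the fields; OR together the values listed for one field."""
    clauses = []
    for field, value in selection.items():
        values = value if isinstance(value, list) else [value]
        joined = " OR ".join(f'{field}:"{v}"' for v in values)
        clauses.append(f"({joined})" if len(values) > 1 else joined)
    return " AND ".join(clauses)


if __name__ == "__main__":
    print(selection_to_query(RULE["detection"]["selection"]))
```

The same rule dictionary could feed a second function that emits a different query language, which is the point of keeping the detection logic in a neutral format.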
This becomes that promise of what we’ve been wanting to get out of this stuff for a long, long, long time. Let’s go back to the slides here real quick.
I did want to show you guys the config files for ingesting on HELK, but my system appears to have frozen on me, which is not cool.
All the rest of my systems seem to be working just fine. We’ll come back to that, maybe. All right.
CJ
We did have a query earlier about whether the gods were going to smile on you, and it just got answered.
John
Yeah, I’ve been trying to do too many demos lately in my webcasts, but I like the hands-on stuff a little bit more. It’s a lot more fun. But yeah, my HELK instance didn’t really like running inside of a VM, I believe. I can’t click anything.
Seems like the whole thing is frozen. But it got me through the majority of my demonstration, which is cool. So moving forward, well, there’s a bunch of things that we probably need to have follow-on conversations for.
First and foremost we’re bringing in Mick to help out. He’s a SEC555 instructor along with Justin Henderson at the SANS Institute. That class is a lot of this. If this really excites you and you like the possibility of what you can actually do with this, you do need to look at that class by Justin Henderson.
And when Mick teaches it, he does an amazing job. But there’s a lot of options as far as moving forward. You can of course be using ELK stacks. You can go and pull down Graylog, which has a lot of kind of pre-canned ELK stuff and a whole bunch of other ingesters set up.
I didn’t talk about Graylog in this webcast because it leads into a commercial product. But I did want to at least mention them as an awesome alternative if you just want to set it up and just start running.
Graylog is fantastic. NXLog for getting Windows logs off. And then at the bottom I have Windows Event Forwarding. You can have Windows Event Forwarding forwarding your events to a master collector, and then you can generate a series of actions.
So there’s a bunch that we can move on to. And I think that this kind of shows that for the first time in a really, really long time as a defender, you have a lot of amazing options as it relates to actual event logging as well.
So Mick will come in and show us a bunch of ELK rock and then also the suc. And there’s so many different variants of ELK that are out there, and you’re more likely going to be standing up your own and mixing and matching, like, this one does this dashboard really cool.
This one has really great Kibana visualizations, and this one has a lot of pre-built ingesters for filtering and normalizing the data on the back end. You’re going to be mixing and matching those as much as you can.
So we are going to be discussing this some more, but we have a whole bunch of webcasts. Jordan found a zero day in some Amazon instances, specifically as it relates to Hue and things like that.
I don’t want to say it’s a zero day, but definitely a vulnerability in the way that cloud computing was set up for EMR, Elastic MapReduce. Oh, it’s back. Hooray.
Do this. This is awesome. Maybe, I think it was just actually my dashboard.
I shouldn’t get too excited because I can move my mouse now. We’re not quite there. I can move my mouse now on my HELK. That’s good. So we have some questions.
We’re at the end of the webcast, but I’ve got a question, Jason, about my question at the beginning of the webcast for people that attended: did anyone mention Crescent Hawk’s?
Crescent Hawk’s Inception? No? BattleTech? No, not a single person. No, not a single person. Because the answer to this question, of somebody playing a video game, was BattleTech:
The Crescent Hawk’s Inception. In this story, Jason Youngblood avenges his fallen classmates. It’s a fantastic video game.
I’m a little bit shocked that no one was able to pick that up, but you wouldn’t believe how hard it was for me to come up with a question that wasn’t immediately googleable, where somebody could Google it in two seconds and have the answer.
So I guess I’ll have to keep making up these little challenges, and we might have to increase the prize associated with it, but, yeah, the game he was playing was BattleTech: The Crescent Hawk’s Inception.
What’s that? We get Leisure Suit Larry every time you ask. Just saying, Leisure Suit Larry is the right answer to that question.
So I’ve actually got my screen up and running. My system unfroze itself, which is nice if we want to actually go through that. Here, let me show that really quick, and then we can close out. So the last thing I wanted to show before my system froze, and it came back, which I think is cool,
is just a lot of work that the team has put together for HELK: actually identifying, for the Logstash pipeline, for input, how to handle the data that it’s ingesting.
So each one of these is given a number, and it processes them in order by number whenever it’s applying the configs and the rules. So these are actually the Sysmon filter rules. Or, sorry, that’s Sysmon.
These are the Sysmon filter rules for Logstash. And, of course, this comes from Cyb3rWard0g. Of course, he works at SpecterOps. Amazing, amazing work. And this is just something I saw that was neat.
Where you can have the hashes and then where you actually put that data, they do some normalization of network connections for pulling in the network connections to a standardized name so it can be cross referenced with other different things.
So you can basically take an event ID and put it into an action as a network connection. So these are the ingest rules for the different types of event logs that it can pull.
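The two ideas described above, numbered pipeline files applied in order and event IDs mapped to a normalized action, can be sketched in a few lines of Python. The file names, action labels, and target field names below are hypothetical; the Sysmon event IDs (1, 3, 11) and the SourceIp and DestinationIp field names are the standard Sysmon ones.

```python
# Sysmon event IDs mapped to assumed, illustrative action labels.
SYSMON_ACTIONS = {
    1: "process_creation",
    3: "network_connection",
    11: "file_creation",
}

# Sysmon field names renamed to assumed common names for cross-referencing.
FIELD_RENAMES = {"SourceIp": "src_ip_addr", "DestinationIp": "dst_ip_addr"}


def load_order(filenames):
    """Numbered config files sort lexicographically, so the prefix sets the order."""
    return sorted(filenames)


def normalize(event: dict) -> dict:
    """Add an action label and rename vendor-specific fields to common names."""
    out = dict(event)
    out["action"] = SYSMON_ACTIONS.get(event.get("event_id"), "unknown")
    for old, new in FIELD_RENAMES.items():
        if old in out:
            out[new] = out.pop(old)
    return out


if __name__ == "__main__":
    files = ["9950-output.conf", "0002-input.conf", "1531-sysmon-filter.conf"]
    print(load_order(files))
    print(normalize({"event_id": 3, "SourceIp": "10.0.0.5",
                     "DestinationIp": "203.0.113.10"}))
```

Because the sort is lexicographic rather than numeric, the zero padding in the file name prefixes matters.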
And just a lot of work went into this, and I wanted to call it out and say I saw it. Mick actually walked me through a lot of this stuff, so very, very cool. All right, so any more questions in the last two minutes before we shut down the webcast?
Any quick pros and cons of using HELK versus using SOF-ELK? So I wouldn’t necessarily say that there’s dedicated pros and cons. I would say SOF-ELK is probably a little bit more mature in what it’s doing because it’s designed for SANS classes.
So SOF-ELK is going to do what it does and it’s going to be rock solid, because with the SANS class, it has to work. However, I think that HELK is being more ambitious insofar as what they’re doing.
So you may run into a few problems or things not being completely baked in with HELK, but it’s alpha. I mean, you go to the website, it’s very clearly alpha, and they’re being a lot more ambitious.
So if you want something that’s kind of rock solid, if you go through a SANS class and it works with SOF-ELK, it’s going to work for you in your environment, because that’s just the way we roll at the SANS Institute. If you want something that’s kind of showing you the bleeding edge of where things are going,
whenever you’re looking at Sigma and you’re looking at integrating the MITRE ATT&CK technique matrix, then you’re going to want to keep your eye on HELK. I would watch both and, like I said, steal from one and the other to kind of build up your own ELK kind of Frankenstein.
And that’s really a lot of the stuff that Mick is going to be talking about as well. Right. Any other questions?
CJ
Do you see extending this to mobile devices?
John
If the mobile device can throw logs. I honestly don’t know if there’s a way to get mobile devices to start throwing logs like this. I don’t know. I think that that’s an interesting question.
I think that that’s a longer answer, though. I mean, you have the actual mobile devices, but honestly, the way I look at mobile devices moving forward into the future is the mobile device is merely a portal to the cloud services now.
You’re using Google and it’s accessing that Google cloud service, you’re taking pictures and it’s uploading them into a cloud service. If I was going to attack mobile devices today, I would specifically focus on attacking the cloud services.
So kind of an end way of looking at it. But as it stands now, I don’t believe so. But if you’re going to be looking at this as a security concern, I would instead focus on the actual cloud services and making sure they’re secure rather than actually looking for malware executing on the mobile devices themselves.
CJ
Excellent. Is there a use case for using both Sysmon and osquery?
John
I think you could. It all depends. I would look at osquery specifically for pulling information about the different operating systems. I’d lean that more towards like a systems administrator, help desk administrator, things of that nature.
As far as leaning one way or the other, I really like Sysmon because it’s an easier sell politically. Whenever you’re talking about it, it’s super easy to set up and get it running on a Windows computer system.
It’s by Microsoft, it’s by Mark Russinovich. So when you’re trying to sell the idea to Windows administrators, it’s an easier sell, I think, politically than something like osquery would be.
CJ
Do you know if there are any plans for HELK to use AWS Elasticsearch as the back end, for taking it out of ELK?
John
Not for HELK, but if you look at it, Amazon actually has their own ELK instance that they’re actually releasing.
You can actually pull down their whole Elasticsearch instance in Amazon too. So Amazon is actually getting behind this as well. So yes, that is something that’s coming down the pipe, and it would probably be pretty easy to take a lot of the HELK configurations and migrate them over to Amazon.
But as I said, right now it’s pretty much alpha. And just trying to get its basic functionality off the ground is an ambitious task, to say the least.
CJ
Got it. You’re out of time, my friend.
John
All right, well, thank you very much for attending, everybody. Bye everyone.