This webcast was originally published on Feb 11 2020.
In this video, Hal and John discuss advanced Unix command line techniques and their implications for system administration and security. They delve into various Unix commands, demonstrating their use in real-world scenarios to manipulate and analyze system data effectively. The session also covers SSH agent hacks and environment variables, providing insights into both productive uses and potential security risks.
- The webinar focuses on enhancing Linux and Unix skills, particularly for self-teaching IT professionals.
- The speaker emphasizes the power of Unix shell primitives and how they can be combined to perform complex tasks efficiently.
- The webinar introduces advanced techniques like log parsing and leveraging environment variables for both productive and security-related purposes.
Transcript
John Strand
All right, Hal, let’s go ahead and kick this off. Everybody, I want to say thank you very much for attending today. This is a really cool webcast. We have someone who, honestly, has been a mentor for me for a really long time.
I would say, Hal, I taught for 15 years, and I would say you were very easily there all 15 years. And there’s a small number of people that I get super excited about anytime I get to hang out with them at the SANS Institute.
Doesn’t mean I don’t like other people. But Hal was easily the most intimidating person at SANS for me, because the dude’s, like, very tall, and he’s got this booming voice that you’re going to hear in a second, and he’s brilliant.
And I look at him very much as that mean big brother that’s always doing his best to teach me a lesson and make me better. And I’m really excited and honored that he’s here to teach us all lessons and make us better at Linux and Unix.
So with that, Hal, I’m going to hand it over to you.
Hal Pomeranz
Sir.
John Strand
Please take it away.
Hal Pomeranz
Wow. Mean big brother. Hmm. Okay. Wow.
Hey. So here I am. And actually I wanted to thank the Black Hills folks for putting this one together.
As I was mentioning in the pre-show banter, I’ve become, I don’t know, I guess a little bit concerned. When I started in IT,
a long, long, long time ago, I went to work in the industry and there were a lot of big IT shops running Unix.
And I got to go work for a site where we had hundreds and hundreds and hundreds of Unix machines and we had a whole staff of Unix admins.
And I was the junior person on the totem pole and I got a lot of mentorship from the people I worked with and I learned a ton just at that first job.
But nowadays that’s not really the pattern, right? I mean, now it’s maybe an army of one.
You’re kind of doing a lot of self-teaching, and you end up with big gaps in your knowledge, right? So I’ve been on this kind of crusade to get out some of the stuff that was imparted to me by more senior admins.
For a while there, Tim Medin and Ed Skoudis and I were having a lot of fun writing the Command Line Kung Fu blog. And that was great until we ran out of ideas.
But there’s still years’ worth of content at that website. And the conceit for Command Line Kung Fu was we’d set ourselves a challenge, and Tim would solve it with PowerShell and Ed would do cmd.exe and I’d do Linux.
Which meant that I spent an hour working on each episode and Tim spent like 4 hours working on each episode and Ed would spend all day and that was great.
But if you’re somebody who jumps between platforms, it’s a great way to spruce up your knowledge across the different platforms.
By the way, you can see the URL here for the slides for this presentation. You’re welcome to grab the slides and play along at home.
There was also a question earlier about the Linux distro that I’m going to be using for the examples. I’m using a CentOS machine, but I’m trying to play it right down the middle of the fairway here.
The stuff that I’m doing should work on certainly any Linux distro and probably on most Unix-like operating systems. It’s all sort of rock and roll to me.
So why Unix? Well, for me it started at a time when Unix was the only game in town.
But still, through the decades I’ve continued to use Unix operating systems, and I still use them professionally in my forensics practice, for which I take some amount of good-natured ribbing from my colleagues, like, “Hal,
why are you using that open source tool chain instead of one of the more integrated commercial forensic suites?” And then they get a white screen of death.
And I’m still being productive and I’m laughing. For me it’s a lot about productivity, right? Learning a command shell.
And honestly I don’t really care whether it’s the Linux bash shell or whether it’s PowerShell on Windows, but being able to automate tasks and being able to do ad hoc kinds of queries and reports is a huge force multiplier for anybody in IT.
So certainly learn to use the command shell of whatever platform you prefer. And PowerShell is an entirely reasonable shell for that.
In fact, in a lot of ways PowerShell has more power than the Unix shell.
The other reason for Unix-like operating systems is just sheer ubiquity. Windows is the dominant desktop platform. You get no argument from me on that.
And there’s certainly a lot of Windows servers out there in the enterprise. But if you look globally, outside of that very large market share, everything else is a Unix-like operating system.
And here I’m talking not just about Linux, but also about macOS, which I consider a Unix-like operating system since it’s got a FreeBSD user space. Plus there’s Android, probably the single largest market share of Linux on the planet.
And now of course all of these wonderful embedded devices that are all over the place, which are mostly just little ARM processors running some Unix thing, whether it’s powering your refrigerator or your automobile or the thermostat on your wall.
And all of this stuff comes up in my investigations because it’s all over the place, right? Forensicate a DVR or whatever.
And the Linux things are important. They’re infrastructure, whether it’s attacker infrastructure, which I’ve gotten a chance to look at, or it’s somebody’s NoSQL database that’s been trashed and is being held for bitcoin by some ransomware application.
It’s infrastructure and it’s important and it’s interesting. So go Unix.
Okay, so that’s the sales pitch, right? Let’s start throwing some commands around and I’m going to get off of the slides and actually just start doing this stuff live.
We’re going to start light. We’re just going to ease everybody into it and then we’re going to ramp it up. The Unix shell is designed to be a text processing environment.
It’s fundamentally designed around processing textual output. You can do very simple things. Here’s a simple string searching example.
The setup here is that the Deer Run website, as you’ll notice if you spend any time on it, aside from being ugly, is also just completely static HTML.
I’m not using any kind of CMS or anything like that. It’s amusing to me because I collect in my Apache logs all of the different WordPress attacks du jour or the Drupal attacks du jour or whatever.
So I can see people just launching all of these different attacks against my website. We’re going to do a little bit of investigation in my web logs and try and pull out some information about who’s doing nefarious stuff, or trying to do nefarious stuff, to my web server.
Wow, that’s like a double muting fail there because you failed to mute your phone and you failed to mute your microphone. It just builds up.
John Strand
Just roll with it.
Hal Pomeranz
All right, here we go. I’ve got some of the logs from my web server. What we’re going to do here is grep for people trying various bogus URL exploits.
A lot of them happen to contain the string “plugins”. It’s a WordPress plugin or something like that. We’re just going to look for the word “plugins” in my access log.
Very simple use of grep. We’re just looking for a simple string like “plugins” in the access log. Okay, Unix trivia question.
I’ll answer it at the end. Who knows where the command name grep comes from? You can put it into the chat and improve your Unix dominance.
John Strand
Now, is this something where they’re just going to be able to, like, hit Wikipedia and it’s going to be done in two seconds? Or is there a little bit more depth to this one?
Hal Pomeranz
No, I think that you could probably Google this one and find the answer, but I’m just wondering who knows this one off the top of their head. Anyway, grep is the string searching thing in Unix.
Here we’re going to pull out all the lines from the access log that happen to match the word plugins. Cool.
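A minimal sketch of the grep step described here. The log lines are made up for illustration (documentation-range IP addresses, invented URLs and user agents), and the temporary file stands in for the real access log, whose path is not given in the webcast.

```shell
# Build a tiny, made-up access log in combined log format.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [11/Feb/2020:10:00:01 -0500] "GET /wp-content/plugins/foo/readme.txt HTTP/1.1" 404 209 "-" "python-requests/2.22.0"
198.51.100.7 - - [11/Feb/2020:10:00:05 -0500] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64)"
192.0.2.10 - - [11/Feb/2020:10:00:09 -0500] "GET /wp-content/plugins/bar/upload.php HTTP/1.1" 404 209 "-" "python-requests/2.22.0"
EOF

# Print every line that contains the string "plugins".
matches=$(grep plugins "$log")
printf '%s\n' "$matches"

# -c counts the matching lines instead of printing them.
match_count=$(grep -c plugins "$log")
echo "matching lines: $match_count"

rm -f "$log"
```

Only the two requests under /wp-content/plugins/ are selected; the ordinary request for /index.html is dropped.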
Here is the Apache log format. Really I should say it’s the NCSA HTTP log format, and I want to find who is responsible for this log format and have a long conversation with them.
Yeah, my hatred of this log format knows no bounds, and yet you have to deal with it constantly because it’s the de facto standard web log format.
So you can see in the first column we’ve got the requester. We’ve got the world’s least useful time and date stamp right here in the middle of the line.
You’ve got the method, you’ve got the URL that was requested, and the protocol. Here’s the response code. You’ll notice these are all 404 Not Found because they don’t exist on my website.
And the number of bytes returned. I’ve gone to the combined log format, which means I also have a referer field as well as the user agent string, the dumpster fire that is modern browser user agent strings.
Yay. That’s a lot of information and it’s a lot of distraction, frankly.
I’d like to get rid of some of this junk and just focus in on the sources that are requesting these nasty URLs.
Okay, so really I just want the first whitespace-delimited column from the output format. Ooh, whitespace-delimited fields.
That means I get to use awk. Yes, awk, that language that everybody’s afraid of.
John Strand
We’re trying to answer questions as fast as we can, but I’m going to let you finish, but I’m going to interrupt you real quick.
Hal Pomeranz
Okay, go ahead.
John Strand
This is interesting because you asked the question about grep. If it’s a super easy question, everyone hits the same answer again and again and again. This one floored me, because I’m going to give you the answers that we received.
Then afterwards you tell me which one is correct. We have: “GNU regular expression”; “generalized regular expression parser”; “it’s the name of some three tools”; “the ed command, g/re/p”;
“global regular expression print”; “stands for the original sets that were set to accomplish the task”; “go retrieve exact phrase”; “global regular expression print” again; “Git regular expression print”.
And I’m not going to read the rest of them, but “it was from the Unix pioneer John Grep” was the other one.
Hal Pomeranz
No, it’s the ed command. It’s g for global, and then you put in the regular expression you want to search for, that’s the re part, and then p means print out the matching lines: g/re/p.
So back in the day, before we even had vi, you used a line editor like ed. And that’s how you found the lines in your file that matched your pattern. That’s where it comes from.
John Strand
The one person that got it, or the first person that got it, was Stefan. If you, sir, could please type in your email address, we’ll pick it up and we’ll send you a cool prize.
Well done, sir.
Hal Pomeranz
Nice. Randy, why didn’t you get that one? Okay. All right. My mission is to pull out the first whitespace-delimited field from lines that match the keyword “plugins”.
Now look, I see people doing this kind of crap all the time and it drives me insane: grep plugins access log, pipe to awk. No, no, no. This is one of my huge pet peeves.
Look, awk has built-in pattern matching, ladies and gentlemen, so I can search for lines that match “plugins” and then print out field number one.
Easy peasy. Okay, there you go. Boom. All right, cool.
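As a rough illustration of that point, the sketch below does the match and the field extraction in a single awk command rather than piping grep into awk. The sample log is invented for the example and is not taken from the webcast.

```shell
# Made-up combined-format log lines for illustration only.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [11/Feb/2020:10:00:01 -0500] "GET /wp-content/plugins/foo/readme.txt HTTP/1.1" 404 209 "-" "python-requests/2.22.0"
198.51.100.7 - - [11/Feb/2020:10:00:05 -0500] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64)"
203.0.113.5 - - [11/Feb/2020:10:02:13 -0500] "GET /wp-content/plugins/bar/upload.php HTTP/1.1" 404 209 "-" "Mozilla/5.0 (X11; Linux x86_64)"
192.0.2.10 - - [11/Feb/2020:10:03:40 -0500] "GET /wp-content/plugins/baz/shell.php HTTP/1.1" 404 209 "-" "python-requests/2.22.0"
EOF

# /plugins/ selects the lines; the action prints the first
# whitespace-delimited field, which is the requesting address.
sources=$(awk '/plugins/ {print $1}' "$log")
printf '%s\n' "$sources"

rm -f "$log"
```

The output still contains repeated addresses, one per matching request.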
So that’s a little bit of awk. And by the way, if you download the slides, I’ve given you pointers in the slides to Command Line Kung Fu episodes and other resources where you can get more details about these commands that I’m throwing at you.
Okay, great. So we’re pulling out the sources that are sending these naughty URLs. But you can see there’s some redundancy here.
I want to clean up the redundancy a little bit. I’m going to take the list of sources and I’m going to pipe it into another useful little idiom.
I’m going to sort them, and this will just give me an alphabetic sort. I’m going to uniq-ify them, so I only get one copy of each unique value, but I’m going to count the number of times each unique value occurs.
You’ve got to sort them first because uniq only works on duplicate lines that are right next to each other. And then I’m going to sort the result numerically, like so.
It’s a beautiful thing, right? So you can see exactly how many of these URLs were requested by each of the different sources.
Okay, this is fundamentally the Unix design religion, right? You have all of these little primitive building blocks like Lego pieces, and you snap them together to quickly produce ad hoc tables and outputs like this.
And this is absolutely one of my favorite little idioms, this sort | uniq -c | sort -n. I call it the command line histogram.
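The histogram idiom can be sketched on its own, independent of any log file. The input here is a hand-written list of addresses, chosen only so that the counts are easy to check by eye.

```shell
# sort groups identical lines together, uniq -c collapses each group
# and prefixes it with a count, and sort -n orders by that count.
histogram=$(printf '%s\n' \
    192.0.2.10 203.0.113.5 192.0.2.10 198.51.100.7 192.0.2.10 203.0.113.5 |
  sort | uniq -c | sort -n)
printf '%s\n' "$histogram"
```

With sort -n the busiest source ends up on the last line; adding -r to that final sort would put it first instead.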
And you can do all sorts of things with all sorts of data using this little idiom. Like for example, it doesn’t have to be Apache web log data.
It could be something like process information. So here again, I’m printing out the first field with awk and I’m piping it into the exact same idiom.
Now I can see the number of processes by user on the system. Obviously the vast majority are root, but here’s that bad actor Hal, who’s got seven processes running on the machine.
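A sketch of the same idiom applied to process data. To keep it reproducible, it runs on canned text shaped like ps -ef output (the rows are invented), and it skips the header row, which is a small addition to what is described in the webcast. The function name count_by_user is hypothetical.

```shell
# Histogram of the first column (the user) of ps -ef style input.
count_by_user() {
  # NR > 1 skips the header row; then apply the histogram idiom.
  awk 'NR > 1 {print $1}' | sort | uniq -c | sort -n
}

# Invented sample in the shape of ps -ef output.
sample_ps='UID PID PPID C STIME TTY TIME CMD
root 1 0 0 09:00 ? 00:00:01 /sbin/init
root 2 0 0 09:00 ? 00:00:00 [kthreadd]
hal 1500 1 0 09:05 ? 00:00:00 sshd: hal@pts/0
root 800 1 0 09:01 ? 00:00:00 /usr/sbin/sshd -D
hal 1501 1500 0 09:05 pts/0 00:00:00 -bash'

by_user=$(printf '%s\n' "$sample_ps" | count_by_user)
printf '%s\n' "$by_user"
```

On a live system the same function would take real input, for example ps -ef | count_by_user.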
I cannot emphasize enough how useful this idiom is, how often you want to count things like this. But in general, the power of the Unix shell is knowing all of these little primitives and knowing how to snap them together.
And that’s really a lot of what we were trying to show you in the Command Line Kung Fu blog. It’s like learning a language, right? Once you know the basic grammatical elements, you can put them together in all sorts of fun and entertaining ways.
Okay, cool. Let me go back to the original data though.
Okay, so on each line, at the beginning of the line I’ve got the source, and at the end I’ve got the user agent string.
Now the user agent strings are problematic for awk, because they contain spaces. awk perceives each one of these separate words in the user agent string as a separate field.
So that’s an issue. And so we need a more flexible way of massaging the output.
If I want to have just the source along with its matching user agent string, I need to get rid of everything in the middle.
For that, there’s sed, the stream editor. What I want to do is basically match everything from these two dashes right after the requester all the way to this double quote here that starts the user agent string.
And if you’ll look closely, you’ll notice that there’s always going to be a double quote from the end of the referrer and then whitespace and then another double quote that begins the user agent string.
So I’m going to use that to anchor my pattern to get rid of this data. Back to grep again.
So I’m going to grep for “plugins” from the access log and then I’m going to pipe that into a sed command, the stream editor. I’m going to match a pattern, and that pattern begins dash, space, dash.
And then there’s a bunch of other characters. Dot matches any character, and star means zero or more occurrences of whatever precedes it, so dot-star matches any run of characters. And I’m going to anchor the pattern at the other end with double quote, space, double quote.
And I’m going to replace that. S is the substitution operator. I’m going to replace that with nothing. So I’m just going to delete it completely.
And there you go. Now, a wise man once said, when giving a live demo, never say anything more predictive than “watch this.” Watch this.
Okay, so there you go. You’ve got source and user agent string.
Oh rats, we’ve got the double quote at the end. Oh, all right, I’ll get rid of that too.
Oops. Substitute a double quote at the end of the line (dollar matches the end of the line) with nothing. So you can just keep stacking up commands in sed. It’s like a little programming language.
Boom, there you go. Okay, so there’s the double quote gone. Happy now? Good. All right, cool. And now we can once again use our little histogram idiom.
I want to see, is the source always using the same user agent string or are they using multiple user agent strings? sort, pipe to uniq -c, pipe to sort -n.
Actually, I’m not going to sort it at the end because I want things grouped together by the source, not by the count.
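The two substitutions and the grouping step might look roughly like this. The exact command typed in the webcast is not shown in the transcript, so this is a reconstruction from the spoken description, run against invented log lines.

```shell
# Made-up combined-format log lines for illustration only.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [11/Feb/2020:10:00:01 -0500] "GET /wp-content/plugins/foo/readme.txt HTTP/1.1" 404 209 "-" "python-requests/2.22.0"
203.0.113.5 - - [11/Feb/2020:10:02:13 -0500] "GET /wp-content/plugins/bar/upload.php HTTP/1.1" 404 209 "-" "Mozilla/5.0 (X11; Linux x86_64)"
192.0.2.10 - - [11/Feb/2020:10:03:40 -0500] "GET /wp-content/plugins/baz/shell.php HTTP/1.1" 404 209 "-" "python-requests/2.22.0"
EOF

# First expression: delete from "- - " through the quote-space-quote
# that sits between the referer and the user agent.
# Second expression: delete the double quote left at the end of the line.
trimmed=$(grep plugins "$log" | sed -e 's/- - .*" "//' -e 's/"$//')
printf '%s\n' "$trimmed"

# Count each distinct source and user agent pair. There is no final
# sort -n, so the rows stay grouped by source rather than by count.
per_source=$(printf '%s\n' "$trimmed" | sort | uniq -c)
printf '%s\n' "$per_source"

rm -f "$log"
```

Because the dot-star is greedy, the first expression runs to the last quote-space-quote on the line, which in this format is the one just before the user agent.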
Okay. All right. And so for example, you can see, like this source right here, one, 7688 address.
So it’s using some sort of Python script to plink away at my website, so that one looks automated. But then you’ve got this source right here, 4591 25 36, and it’s actually sending me multiple user agent strings.
Now, is that because programmatically, whatever’s trying these is just throwing in random user agent strings so that it doesn’t look like a bot?
Or is this maybe like a gateway where there’s multiple people behind some sort of address translation or VPN endpoint that are using different actual web browsers?
Well, let’s find out. I’m going to grep for that source from my access log and let’s take a look at the URLs that are being requested.
They’re looking for robots.txt. They’re looking for a bunch of URLs that don’t exist on my website. To me this looks like some sort of automated scanning activity that’s merely using multiple different user agent strings to try and throw you off the track.
Because all of the URLs they’re looking for look like potential compromise URLs.
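One way to pivot on a single source is sketched below. It uses an exact awk match on the first field instead of the grep mentioned in the webcast, so that the dots in the address are not treated as regular expression wildcards. The address, the URLs, and the variable name suspect are all made up for the example.

```shell
# Made-up combined-format log lines for illustration only.
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.5 - - [11/Feb/2020:10:02:10 -0500] "GET /robots.txt HTTP/1.1" 404 209 "-" "Mozilla/5.0 (X11; Linux x86_64)"
198.51.100.7 - - [11/Feb/2020:10:02:11 -0500] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Macintosh)"
203.0.113.5 - - [11/Feb/2020:10:02:13 -0500] "GET /wp-login.php HTTP/1.1" 404 209 "-" "Mozilla/5.0 (Windows NT 10.0)"
203.0.113.5 - - [11/Feb/2020:10:02:15 -0500] "GET /wp-content/plugins/bar/upload.php HTTP/1.1" 404 209 "-" "Opera/9.80"
EOF

suspect=203.0.113.5   # hypothetical source picked from the histogram

# In this format the requested URL is the seventh whitespace-delimited field.
requested=$(awk -v ip="$suspect" '$1 == ip {print $7}' "$log")
printf '%s\n' "$requested"

rm -f "$log"
```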
John Strand
I just wanted to jump in. We’ve got a couple of questions, but I also wanted to say, I think this type of deep dive, like when I learned log parsing, this is the way we always used to parse logs: you’d go through, take what you knew and throw it away, and focus on the things you didn’t know until you got to something interesting.
Hal Pomeranz
Always be pivoting, that’s my motto here.
John Strand
The same process could be used if you’re using an ELK stack or a SIEM, to try to go through your logs and just swim in them, and not expect the SIEM to tell you exactly what’s going on but actually look at the raw data.
Got a couple of questions I wanted to throw your way real quick. One: awk’s print $N is limited to a whitespace delimiter, no? What is a quick way to modify that on the fly, say to something like a semicolon?
Hal Pomeranz
Yeah. So awk does whitespace as a delimiter by default, but it has a -F option so that you can use other delimiters. So, for example, I want to awk my password file, which is colon-delimited, and I want to print out the username, which is field one, and the user ID, which is field three.
head just gives me the first ten lines by default. So yeah, you can specify a delimiter with awk with -F. Very cool.
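A small sketch of the -F answer. It reads an invented file in passwd format rather than the real /etc/passwd, so the result is the same on any machine.

```shell
# Invented entries in /etc/passwd format (colon-delimited).
passwd_sample=$(mktemp)
cat > "$passwd_sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
hal:x:1000:1000:Hal:/home/hal:/bin/bash
EOF

# -F: switches the field separator to a colon.
# Field 1 is the username and field 3 is the numeric user ID.
# head keeps at most the first ten lines.
users=$(awk -F: '{print $1, $3}' "$passwd_sample" | head)
printf '%s\n' "$users"

rm -f "$passwd_sample"
```

For the semicolon case raised in the question, the option would be written as -F';' with the quotes keeping the shell from treating the semicolon as a command separator.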
John Strand
Someone else asked would it be possible for us to get your history of this session so we could share that with the webcast as well because you have the slides.
But the idea of being able to get the command history would be cool too. We had a couple of people ask.
Hal Pomeranz
Yeah, no, I’m totally down with that. We’ll figure out how to do that and we’ll make it available.
John Strand
Okay. And then one request. And you’ve been doing this for a bunch of them, but when you get the results, kind of going through the results and kind of explaining what it is you were looking for is something somebody was asking for.
But I see that you’re doing that anyway, so I’ll shut up now and carry on, sir.
Hal Pomeranz
Okay, cool. So the Unix design religion is putting things together with pipes and doing these kinds of crazy on the fly shenanigans.
And after that it’s just practice, right? It’s learning all the different primitives and what they can do for you. I’ve been doing it for 30 years.
I’m pretty good at it by now. So don’t feel bad about just starting with the “see Spot run” kind of thing in Unix.
You’ll pick it up as you go along. And the other thing is, like John and I were just talking about, if you’re swimming through your logs or you’re swimming through any other data, just follow your nose, just keep pivoting around and looking for interesting things, interesting patterns, and follow those leads wherever they take you.
Sometimes you end up going down a rat hole, but sometimes it’s interesting. You just don’t know until you go there. Okay, so I want to change it up a little bit.
Let me actually get back on the slideware a little bit. So the slideware kind of goes through a bunch of the examples that I showed you live.
I want to talk a little bit about environment variables. Everybody’s like, “Oh, God, Hal. Come on. Environment variables? Really? I mean, isn’t that kind of basic?”
Yeah. Okay, let’s have some fun with environment variables. Yes, environment variables can do things like change your search path. I can even change my prompt.
PS1 is the environment variable. If you set it, it changes your prompt. And I could be good or I could be evil.
Just depends on your mood. Oops. Yeah. Imagine how good I could be if I actually got the quoting right on that one. Go me. All right, there we go.
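For illustration, the search path and prompt changes could be written as below. The prompt text and the bin directory are assumptions; the transcript does not show what was actually typed.

```shell
# Prepend a personal bin directory to the command search path.
export PATH="$HOME/bin:$PATH"

# Single quotes keep the space and the dollar sign literal, so the
# whole string becomes the prompt.
export PS1='[evil]$ '
```

The new prompt only shows up in an interactive shell; in a script the assignment simply sets the variable.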
But there’s all kinds of really fun stuff that you can do with environment variables in Unix. One of the ones that I actually really like for my forensic practice is an environment variable called TZ, time zone, which sets your local time zone for this shell.
So, for example, right now I’m in Florida, US East Coast. So I’m in Eastern Standard Time. And if I do a listing of a file, like, I don’t know,
/etc/resolv.conf, I’m seeing the last modified time in Eastern time. But maybe that time zone is inconvenient for my investigation, so I can pick any time zone I would like.
So for example, if I wanted to be in UTC, the one true time zone, I just set that environment variable. And now when I run the date command or I run any other command that’s showing me a timestamp, it shows me that timestamp relative to the time zone that I’ve chosen.
Just as a side note, I wanted to point out that, in Linux systems anyway, time zone files are stored in a directory called /usr/share/zoneinfo.
And the time zone names that you set in the TZ environment variable are basically just file paths that are relative to the top of that directory.
You can see there’s UTC time right here, but there’s also for example the Europe subdirectory which contains different cities.
So, for example, I could say export TZ equal to Greece, or, sorry, Europe/Athens.
And now I’m in Eastern European Time, which is the time zone that the city of Athens is in. So that can be a convenient way, if you don’t happen to know what the time zone is: you can do it by geographic location, just by browsing around in /usr/share/zoneinfo.
So anyway, when I want to switch back to my home time zone, I can just unset the environment variable, and here I am back in US Eastern time again.
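The TZ steps above, sketched as a short sequence. It assumes a system with the zoneinfo database installed under /usr/share/zoneinfo; on a minimal system without it, the Europe/Athens setting would not produce a meaningful local time.

```shell
# One-off override: only this single date command sees TZ=UTC.
utc_zone=$(TZ=UTC date +%Z)
echo "zone abbreviation under TZ=UTC: $utc_zone"

# Session-wide: every later command in this shell uses the exported zone.
export TZ=Europe/Athens
date

# Zone names are file paths relative to the zoneinfo directory.
if [ -d /usr/share/zoneinfo/Europe ]; then
  ls /usr/share/zoneinfo/Europe | head -n 5
fi

# Unsetting TZ returns the shell to the system default time zone.
unset TZ
date
```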
So that’s a fun one, but still. “Okay, Hal, your prompt says you’re evil, but you’re just showing me stuff that a blue team type analyst would do.
What can we do that’s evil with environment variables, Hal?” Well, let’s talk about SSH agent. Now, those of you who use Linux and Unix a lot, or Macs, probably are familiar with SSH agent.
It’s the program that runs in the background that stores the private keys that you use to log into other systems, and your SSH clients communicate with that SSH agent process running in the background using a Unix socket.
And the file path to that Unix socket is stored in an environment variable, and that environment variable is called SSH_AUTH_SOCK.
There is the path to my SSH_AUTH_SOCK here in my shell. Once I have keys loaded into the agent, it becomes this frictionless way for me to log into remote systems without having to type my passphrase over and over again.
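A minimal sketch for checking what the current shell knows about an agent. The function name agent_status is hypothetical; SSH_AUTH_SOCK and ssh-add -l are standard OpenSSH pieces.

```shell
# Report whether this shell has an agent socket and, if so, which
# keys the agent is holding.
agent_status() {
  if [ -n "${SSH_AUTH_SOCK:-}" ]; then
    echo "agent socket: $SSH_AUTH_SOCK"
    # ssh-add -l lists fingerprints of the keys loaded in the agent.
    ssh-add -l 2>/dev/null || echo "(no keys loaded, or agent not reachable)"
  else
    echo "no agent socket in this shell"
  fi
}

agent_status
```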
It’s hugely useful. If you’re not using SSH agent, oh, you’re missing out on a huge productivity increase. The problem is that, okay, now your private key is sitting in memory, in the memory of this SSH agent process, and potentially it could be scraped out of memory by somebody nefarious who’s broken root on the system.
The attacks on SSH agent keys in memory are well documented, and in the course slides I give you a pointer to a blog that talks about how to attack SSH keys in memory.
But there’s actually an easier exploit that you can use to pivot off of systems where SSH agent is in use.
It’s a pivot that even works on systems where an SSH agent isn’t running but a user is using agent forwarding to forward a connection to a remote SSH agent on their primary desktop, for example.
But to do any of these exploits against SSH agent, I need to be root. First I need to break root on the box. I am so leet that if you give me the root password, I can be root in no time at all.
Okay, so now I’m root here. Let me give myself the traditional root prompt.
Okay, there I am, right, okay, so I am root on this system. All right. Now, because I’m root, I can read any file. I can see everything on the system.
So in particular I want to find out the SSH_AUTH_SOCK for different users because I want to basically hijack their SSH agent connection.
For that, I’m going to show you another one of my favorite Unix command line tools. The first thing I’m going to do is I’m going to look and see if there are any SSH agent processes running on the system.
Nope. Okay, so nobody’s running SSH agent on the system. How about agent forwarding though? Okay, now if agent forwarding is active, it means that the SSH_AUTH_SOCK is actually being managed by the user’s SSH daemon process, the daemon that’s running their particular session.
So I want to know about Unix sockets that are owned by SSH daemon processes. And this is where it gets fun.
Okay, so I want Unix sockets that are also associated with ssh daemons. I love lsof.
Lsof is easily my favorite Unix command. It stands for list open files, and everything in Unix is a file.
So files are files and directories are files, but sockets are also treated like files. And so using lsof you can get information about open network connections, open files, and so on.
And lsof violates the Unix design principles. It combines the output of multiple tools into one uber tool.
And so here I’m asking for things that are Unix sockets and which are owned by commands named sshd.
You can see that for user hal, there’s my SSH agent path. Okay, cool.
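The lsof query described here can be sketched as follows. The exact options used in the webcast are not shown in the transcript; this version uses the standard lsof selectors -U, -c and -a, wrapped in a hypothetical helper so that it degrades gracefully where lsof is not installed. Seeing other users’ sockets generally requires root.

```shell
# List Unix domain sockets held open by sshd processes.
list_sshd_unix_sockets() {
  if ! command -v lsof >/dev/null 2>&1; then
    echo "lsof is not installed" >&2
    return 127
  fi
  # -U: Unix domain sockets only
  # -c sshd: processes whose command name begins with "sshd"
  # -a: AND the two selections together instead of ORing them
  lsof -a -U -c sshd
}

list_sshd_unix_sockets || true
```

Without -a, lsof would list everything that matches either selector, which is a much longer list.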
Now, what about environment variables, Hal? Well, okay, so now I’m root. I can read from that same socket just like the user can. So I’m going to export SSH_AUTH_SOCK and I’m going to set it to the user’s SSH_AUTH_SOCK.
So now I can impersonate that user. I can use ssh-add to list the keys that are present in that SSH agent process that I’m talking to.
This allows me to confirm that there are actually keys stored in that SSH agent that I can use. Where can I go?
Well, I don’t really know, but remember that in the user’s home directory there is the known_hosts file, which stores the public keys of systems that the user has connected to from this machine.
Hey, remember awk?
The tilde means home directory of user hal. In the .ssh subdirectory I want to look at the known_hosts file and print out the first field, which is the host names or the IP addresses of the machines that the user has been visiting.
If they regularly visit those machines from this host and they’re using agent forwarding on this host, then chances are that agent is going to allow you to connect into one of these systems.
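A sketch of the known_hosts step, run against an invented file so it is reproducible. The host names, addresses and key strings are placeholders. Note that when the OpenSSH HashKnownHosts option is enabled, the first field is a hash rather than a readable name, and this approach yields nothing useful.

```shell
# Invented known_hosts content; the key material is placeholder text.
known=$(mktemp)
cat > "$known" <<'EOF'
server1.example.com,192.0.2.20 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPlaceholderKeyData
192.0.2.30 ssh-rsa AAAAB3NzaC1yc2EAAAADAQABPlaceholderKeyData
EOF

# Field 1 holds the host name, the address, or both separated by a comma.
targets=$(awk '{print $1}' "$known")
printf '%s\n' "$targets"

rm -f "$known"
```

On a real system the input would be the user’s own file, for example ~hal/.ssh/known_hosts as described above.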
Now I have to do it as user hal, right?
It’s a beautiful thing. I’ve just logged into another system as user hal. Breaking root from here is up to you.
This is the great SSH agent pivot. I didn’t have to scrape a key out of memory or do anything tricky like that. I just had to talk to the same SSH agent process that the user was using.
It’s the Unix equivalent of a pass the hash attack. I’m leveraging their agent to log in as them other places on the network.
And in fact, if I actually log in as them using the -A option, which is the agent forwarding option, then I still have access to that same SSH agent process and I can keep pivoting off of this system, still leveraging that SSH agent process that I pivoted off of from the previous system.
And that’s a beautiful thing because, in general, people who like to use SSH agent like to use the same keys in lots of different places.
And you can go pretty far with this technique. It’s a beautiful thing. And this is one of the things that we talk about in my Securing Linux class with SANS. SSH agent is hugely convenient, but there’s always this trade-off between convenience and security, and you’re seeing the rough end of that stick here.
Okay, so that’s cool. A little bit of SSH hacking with environment variables. Now the thing is, talking about history, your history is going to give you away.
I’m doing all these nefarious things and you can clearly see in my history this pattern of evil that I’ve done. To do this, I need to make that go away.
Bash history anti forensics is a topic of much debate, but let me show you my favorite commands for making your history not be a problem for you.
There’s a couple of environment variables that you need to know about. One is export HISTSIZE equal to zero, okay? And that nukes your history in memory.
And it also means that you won’t be accumulating any history from this point forward. Now you can’t see my history anymore
if I run the history command. If I take a memory dump of this system and I do some string searching, I will probably be able to find remnants of that history floating around in memory for some period of time.
But it won’t be easily extractable with something like Volatility’s linux_bash plugin. That nukes my history right now.
But the problem is if I exit this shell, it’s going to truncate the bash history file on disk to zero bytes. That’s going to tell people, oh, something hinky has been going on.
People have been blowing away bash history. So the other variable I wanted to introduce you to was export HISTFILE equal to /dev/null.
Okay, so this changes where the shell is going to write its history. And because I’ve set it to /dev/null, when I exit this shell, it’s not going to impact the current history file on disk.
So /dev/null is just the bit bucket. Everything that goes in there goes away. So this combination of environment variables is sort of my preferred method for hiding what I’m doing.
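Written out, the two settings described above are simply the following. HISTSIZE and HISTFILE are standard bash variables; the effect is only visible in an interactive bash session.

```shell
# Keep zero commands in the in-memory history list.
export HISTSIZE=0

# Send history to /dev/null on exit, so the existing history file
# on disk is neither appended to nor truncated.
export HISTFILE=/dev/null
```

From the defensive side, a session whose HISTFILE points at /dev/null, visible for example in the process environment, is itself a useful indicator.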
John Strand
Someone just brought up a point that said you’re supposed to share your command history with us. Is what you just did going to wipe out the entire command history of everything that you did as part of this webcast?
Hal Pomeranz
In this root shell? But I can certainly recreate it for you, right? I mean, I know the examples I did. So, yes.
John Strand
You’d be surprised how quickly people were like, “No, no, I think Hal can recreate it. It’ll be okay.”
Hal Pomeranz
I can rebuild it. We’re okay. Don’t sweat it. We’re all good here. So, yeah, I did a little presentation for the Forensic Summit called “You Don’t Know Jack About Bash History.”
And if you’re interested in taking a look at that presentation, it actually was recorded, and it’s online, and you can get the slides from my website. There’s a lot more about bash history forensics and anti-forensics there.
Okay, but let’s suppose you screwed up. Let’s suppose, instead of nuking your history in this way, you actually did an rm on root’s .bash_history.
The problem with rm is that while it unlinks the file from the file system, the content of the file is still floating around in blocks that are now in the unallocated collection.
The original content of the file is not overwritten. It’s merely marked as unallocated. But until something overwrites those blocks, I can recover them.
This is a lot of what we do in forensics, recovering deleted data that was removed by attackers. Built into Unix, there is a secure delete command called shred.
shred overwrites the file content with random data. shred -u means overwrite and then unlink the file.
It’s like the secure delete version of rm. And if people use shred to delete their files, eh, it makes my life, well, pretty damn difficult.
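A minimal sketch of the shred -u behavior described above, run against a throwaway temp file. The file name and its contents are placeholders for illustration, not anything from the demo.

```shell
# Create a throwaway file standing in for something sensitive (hypothetical).
secret=$(mktemp)
printf 'rm -rf /tmp/evidence\n' > "$secret"

# -u: overwrite the file's contents, then unlink it.
shred -u "$secret"

# The path should no longer exist afterwards.
if [ -e "$secret" ]; then echo "still there"; else echo "gone"; fi
```

shred here is the GNU coreutils tool; it may be missing on non-GNU systems.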
Small caveat: portions of the file may be available in the file system journal for a short period of time after these kinds of file operations.
But in general, once somebody shreds a file, it’s pretty much gone. All right, but what happens if I screwed up?
What happens if I stupidly just removed bash history? Boom. I made it go away from the file system, but I know the content is still floating around in unallocated.
Believe me, it’s very easy for me to find deleted bash history in unallocated, right? Because there are patterns like rm and then a file path, like rm something or cd something, that make it easy to locate shell commands floating around in unallocated.
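To illustrate the kind of pattern search being described, here is a small sketch that scans a raw binary blob for strings that look like rm or cd commands. The function name, the regular expression, and the sample data are all assumptions for illustration; the talk does not say which tool or pattern is actually used, and a real case would run this against a dump of unallocated space.

```shell
# Hypothetical helper: print fragments that look like "rm <path>" or "cd <path>".
find_shell_commands() {
    # -a: treat binary input as text; -o: print only the matching part;
    # LC_ALL=C: match byte-wise so stray non-text bytes don't confuse grep.
    LC_ALL=C grep -aoE '(rm|cd) +[-./~_[:alnum:]]+' "$1"
}

# Demo on a small made-up blob mixing junk bytes with shell history lines.
sample=$(mktemp)
printf 'xx\001\002junk\ncd /var/tmp\nrm -rf ./tools\n\003\004' > "$sample"
find_shell_commands "$sample"
rm -f "$sample"
```

The pattern is deliberately loose and will produce false positives; it is only meant to show why plain-text shell history stands out in otherwise binary data.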
So I got to somehow overwrite that data that I’ve accidentally put into unallocated that I don’t want people to see.
Here’s a cute trick for that. I want to make sure that I’m in the right file system.
Root’s home directory is here in this volume. I want to overwrite all of the unallocated blocks in this file system where I deleted root’s bash history.
So, dd: my input file is going to be /dev/urandom. My output file is just going to be a file named junk.
To make this go faster, I’m going to set the block size to 1 MB. When the dd command fills up the disk, it’s going to exit. When it exits, I’m going to remove the file called junk that I just created.
What’s going to happen here? The dd command is going to make a giant file, full of random data, that eats up all of the remaining available disk space.
Then when the disk is filled up, the dd command exits and I just delete the big file that I created, freeing up all that space. But the net effect is that I’ve wiped all of the unallocated and just filled it up with random data.
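As described, the command amounts to something like `dd if=/dev/urandom of=junk bs=1M; rm junk` run from inside the target file system. The sketch below wraps that in a function. The function name, the optional size limit (so it can be tried safely), and the sync before the delete are additions for illustration and were not part of the demo.

```shell
# Hypothetical wrapper around the dd trick from the talk.
#   $1: a directory on the file system whose free space should be overwritten
#   $2: optional cap in MB (an addition, for safe dry runs); omit to fill the disk
wipe_free_space() {
    local dir="$1" limit_mb="${2:-}"
    local junk="$dir/junk"
    if [ -n "$limit_mb" ]; then
        dd if=/dev/urandom of="$junk" bs=1M count="$limit_mb" 2>/dev/null
    else
        # With no count, dd keeps writing until the file system is full,
        # then exits with an error, which is expected here.
        dd if=/dev/urandom of="$junk" bs=1M 2>/dev/null
    fi
    # Addition: flush cached writes so the random data actually reaches
    # the disk blocks before the file is removed.
    sync
    rm -f "$junk"
}

# Example (not run here): wipe_free_space /root
```

Filling a live file system to 100% can disrupt anything else writing to it, so a capped dry run in a scratch directory is a sensible first test. bs=1M is the GNU dd spelling; BSD and macOS dd expect bs=1m.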
Now this is going to take a while, right? And I’ll just let that go. But anyway, that’s the last example I wanted to do anyway, so we can just let that go in the background.
So, actually, let me show you while we’re here: built into the bash shell, there is a benchmarking primitive.
Oops. So I can take the same command line and put time at the front of it. And whenever this finishes, it’ll show me exactly how much time this command took.
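A small sketch of bash’s time keyword. To keep it harmless, it times a dd from /dev/zero to /dev/null rather than the disk-filling command from the demo, and it is invoked through bash -c (an assumption made so the example behaves the same even when the calling shell is not bash).

```shell
# `time` is a bash reserved word; it reports real/user/sys time on stderr.
# The braces group the commands so that everything inside them is timed.
timing=$(bash -c 'time { dd if=/dev/zero of=/dev/null bs=1M count=16 2>/dev/null; }' 2>&1)
printf '%s\n' "$timing"
```

Note that in a line such as `time dd ...; rm junk`, only the dd is timed, because the keyword applies to the pipeline that immediately follows it. Grouping with braces, as above, times the whole sequence.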
All right, so I’m going to let that go in the background. We’ll check back in on it in a few minutes when it’s had a chance to fill up the file system again. The slides have a lot of information that I showed you already, but I could do this all day, right?
There are so many cool command line tricks that I haven’t had a chance to show you in this one-hour webcast. And based on the number of people who signed up for this sucker, we were curious: if we put together a one- or two-day command line skills class, would people be willing to pay money for that?
So, like, maybe $200 for a one-day online class, something like that. So anyway, if you’d be interested in a longer training session, possibly where you had to pay for it, please drop cl dojo into the chat just so we can sort of get a feel for how many people would be interested.
Also, if you have specific topics that you are just dying to learn, that would help me put the class together. So feel free to drop those into the chat as well.
John Strand
So yeah, if you could put in the cl dojo if you’re interested right now. And then also there was a ton of questions, and Hal, we’ve got the full accounting of all the questions they asked. We tried to do our best to keep up, and we were doing fine right up until the point that the cl dojo thing happened.
And now I’m giving up. I’ve got some questions at the top for people that I would like to kind of run through in the last ten minutes, if you don’t mind.
Hal Pomeranz
Yeah, absolutely. That’s why I left some time here.
John Strand
All right, awesome. The first one that came up, and I’m going to blame MalwareJake for this, but somebody put in: I would love to hear Hal’s take on systemd.
Hal Pomeranz
And if you read Jake’s Twitter feed, he is utterly bagging on systemd. And I have to say that you cannot count me in the systemd haters camp.
And this is 30-plus years of Unix talking. I actually think that systemd is good, because the engineers at Red Hat who developed systemd actually understood the problem space and what was wrong with existing Unix booters.
So systemd has some things in it that are really worthwhile. The whole architecture around dependency trees for application stacks is only becoming more important as our application stacks become more and more complicated.
You’ve got Kubernetes orchestrating a bunch of Docker containers. Inside the Docker containers there’s a three-tier web app with a database and some middleware and a web server.
The cool thing about systemd, which is also true of other things like Upstart on Ubuntu, is that you can define a dependency graph so that if something breaks lower in the stack, it gracefully shuts down all of the other layers of the stack and then brings them back up automatically when you fix the problem at the lower layer.
And that makes that a lot easier. The other thing is that systemd has features like start-on-demand services. So you can have systemd bind to a port, and then, when the service is requested for the first time, it can start that service.
And that’s important because if you think about the way Unix systems are deployed right now, they’re in multi tenant environments where you have thousands of virtual instances running on a big hypervisor.
Now if I have to reboot that hypervisor and restart all of those virtual instances and they have to restart all of their services all at effectively the same time, that’s a huge load on the hypervisor.
Start on demand lets me spread out that load and only start the services as necessary. It makes sense for the way we deploy Linux systems these days.
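To make the dependency and start-on-demand ideas a little more concrete, here is a sketch that writes a pair of systemd unit files into a scratch directory. The unit names (webapp, database.service), the port, and the binary path are all hypothetical, and the service would need to be written to accept a listening socket handed over by systemd; nothing here is taken from the demo system.

```shell
# Scratch directory for illustration; real units usually live in /etc/systemd/system.
unit_dir=$(mktemp -d)

# Socket unit: systemd itself listens on the port (hypothetical port 8080).
cat > "$unit_dir/webapp.socket" <<'EOF'
[Unit]
Description=Listening socket for the hypothetical webapp service

[Socket]
ListenStream=8080

[Install]
WantedBy=sockets.target
EOF

# Service unit: started on the first connection to the socket above.
# Requires=/After= express the dependency on a lower layer of the stack.
cat > "$unit_dir/webapp.service" <<'EOF'
[Unit]
Description=Hypothetical web tier, started on first connection
Requires=database.service
After=database.service

[Service]
ExecStart=/usr/local/bin/webapp
EOF

ls "$unit_dir"
```

On a real system, the files would be copied into /etc/systemd/system, followed by `systemctl daemon-reload` and `systemctl enable --now webapp.socket`, so that only the socket is active until the first request arrives.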
Now, from an administration perspective, systemd is also a good thing. We’re going to put this in the background. Because I can do things like systemctl status sshd, I see not only, hey look, sshd is running, I get to see all the logs.
Those of you who’ve been doing Unix admin for a long time know the way this happened in the bad old days, right? If sshd went away, you had to go into /var/log and grep through the log files until you remembered which log it was that actually got the errors from sshd, right?
And it was a big pain in the butt. So with this it’s a one-stop shop: I get the status of the service and all of the logs associated with it.
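The one-stop-shop idea could be wrapped up roughly like this. The function name and the default of 20 log lines are assumptions for illustration; systemctl status and journalctl -u are the standard commands for a unit’s state and its journal entries.

```shell
# Hypothetical helper: show a unit's status plus its most recent journal entries.
#   $1: unit name (e.g. sshd on Red Hat style systems, ssh on Debian/Ubuntu)
#   $2: number of log lines to show (defaults to 20)
service_report() {
    systemctl status "$1" --no-pager
    journalctl -u "$1" -n "${2:-20}" --no-pager
}

# Example (needs a systemd host, so not run here): service_report sshd 50
```

systemctl status already prints the last few log lines for the unit; journalctl -u is the way to pull more of them.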
The fact that all the Linux distros are now going to systemd means that there’s now one single command line interface for managing system services.
It’s not like the old days where Red Hat had chkconfig, and on Debian you were still using scripts in /etc/init.d, and Ubuntu had Upstart and all this stuff.
There’s homogeneity. And if you’re somebody who jumps around between Linux platforms, that is a huge time saver. So actually, count me in the “systemd is good” camp, and Jake and I can have a cage match about this later on.
But I’m right and he’s wrong.
John Strand
And there’s a lot of people, I think, that whenever they look at systemd, that change happened and then commands like ifconfig switched over to ip a and things like that.
And I think it’s a mistake to try to lump all of those changes together and say that that’s systemd, because it’s really not. And I’m not saying that’s what Jake does, but I’ve been with people who say, oh, I hate systemd.
It got rid of ifconfig. And I’m like, that’s not what you think.
Hal Pomeranz
That change was happening long before systemd.
John Strand
No, they’re different things. I just think that for some distributions they hit at the same time in the default installation, and people think that they were all together.
Another one: does the SSH agent only work if they’re actively logged into the system?
Hal Pomeranz
Yeah. So generally SSH agent starts when the user starts their windowing environment. So typically there’s an SSH agent process running for the user who’s logged in on the console.
But the caveat is what I just showed you in the example in this webcast: people who are doing agent forwarding. In my case, I was doing agent forwarding from my Mac operating system into this Linux virtual machine.
The SSH agent process is actually running on my Mac, but I’m accessing it over the forwarded agent connection.
John Strand
Very cool. Brandon asked, is there an alternative for lsof? It’s no longer installed by default on his Red Hat Enterprise Linux servers, which, I thought lsof was still installed by default, but.
Hal Pomeranz
Yeah. Well, shame on Red Hat. I mean, ’cause it’s only the single most useful Unix admin command in the world, but okay. I mean, no, but there’s yum install lsof.
John Strand
Yeah, it takes just a second.
Hal Pomeranz
Yeah.
John Strand
So now you mentioned a talk. You went over it very quickly. It was something like you don’t know anything about.
Hal Pomeranz
You Don’t Know Jack About Bash History.
John Strand
Is that online somewhere that people could pull that down?
Hal Pomeranz
Yeah. So in the slides for this talk, I actually gave the URL for that talk. Also, if you go to YouTube, they recorded me giving that presentation at DFIR Summit.
You can actually see me giving a kind of high-speed version of that talk in a 30-minute slot at DFIR Summit.
John Strand
I think I’ve actually seen that. It is the entire talk in 30 minutes. Pretty cool. Will there be an SSH agent left in a screen or a tmux session
if the user started from a Windows session as well?
Hal Pomeranz
Yeah, if they logged into the Unix machine with agent forwarding and then started screen and left it running, then that agent connection will persist.
No, it won’t. No, it won’t. Because the minute they exit the SSH connection that’s connecting to that machine, it breaks the agent forwarding. Right. Because the agent forwarding is carried over that SSH login session.
So as soon as the SSH login session goes away, so does the agent forwarding. Now the SSH_AUTH_SOCK variable will still be set in that user’s shell, but there won’t be any way to talk to the SSH agent process.
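One way to check for the situation just described, where SSH_AUTH_SOCK is still set but nothing is listening behind it, is sketched below. The function name and the state labels are made up for illustration. The exit codes relied on are those documented for ssh-add: 0 on success, 1 if the command fails (for example, no identities loaded), and 2 if the agent cannot be contacted.

```shell
# Hypothetical helper: report whether the agent named by SSH_AUTH_SOCK is usable.
agent_state() {
    if [ -z "${SSH_AUTH_SOCK:-}" ]; then
        echo "unset"            # no agent socket configured in this shell
        return
    fi
    if [ ! -S "$SSH_AUTH_SOCK" ]; then
        echo "stale"            # variable is set, but the socket file is gone
        return
    fi
    rc=0
    ssh-add -l >/dev/null 2>&1 || rc=$?
    case "$rc" in
        0) echo "reachable, keys loaded" ;;
        1) echo "reachable, no keys" ;;
        *) echo "unreachable" ;;
    esac
}

agent_state
```

In a screen or tmux session that outlived its SSH login, this would be expected to report "stale" or "unreachable", assuming the forwarded socket is cleaned up when the login session ends.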
Okay, my dd command just finished. So there we go.
John Strand
Very cool. And a bunch of people found the YouTube link, and we’ve got that out to everybody. Yeah, I was surprised.
Hal Pomeranz
You can pop my name into Google and you’ll find a number of different Linux-related presentations. There’s one about SELinux and stuff like that.
John Strand
Just google me, it’ll show up.
Hal Pomeranz
Yeah, I wasn’t, I was trying not to be that guy.
John Strand
I was going to pull you into it, though. I was going to pull you into it. I also love it when people type in. Did it just crash for everyone or just me? It’s probably just you.
So here’s the deal. Everything blew up with the cl dojo. So, we’re going to get this cut. We’re going to get it online for everybody, and we’ll let you know sometime later on in the year.
And we will absolutely be doing more with Hal. So thank you so much for coming on, Hal. I really appreciate it and I hope you had a good time. And I’m going to throw this last one in.
By the way, I have a question. I usually don’t do this with people’s desktops, but I’m going to do it with you. So you’ve got block dir header and entries, block dir tail hashes, block dir tail post delete.
What do you have there? Or can you not talk about it?
Hal Pomeranz
No, actually. So for those of you who’ve been following stuff I’ve been doing over on my Righteous IT blog, I was doing a bunch of articles about XFS internals, and I sort of got interrupted by casework in the middle of that.
But those are screen captures of Synalyze It! grammars for different internal XFS data structures that are going to become part of that series of presentations.
I’m the guy who likes to look at file system data structures in a hex editor. That’s just my thing. That’s what that stuff is.
John Strand
Cool. All right, Hal, any final words before we kill the recording?
Hal Pomeranz
Hey, Linux is awesome. Get you some Linux.