Posts bydmitryc

Using Machine Learning To Detect Anomalies

I’m going to start blogging more about detection of protocol/app anomalies, detection of lateral movement and/or data exfiltration, and more. For many years I have been watching users and applications furrow their way across networks and I’m gonna start data-dumping that info here 🙂

But…first…I manage a web server for a friend. It occurred to me that machine-learning could be useful in alerting when an attack is under way. I took the following steps

1) Get as much data as possible for this device. For Apache, this just meant gathering all the log files.

2) Parse the data and, for each session, look at the path taken as the user or bot perused the server (Note: outside of my initial scope, but timestamps are useful here to weed out a user versus a machine).

3) So, an average session will look like R1->R2->R3->RX where each “R” is a request. So R1 could be index.html, R2 could be “Contact Us”, R3 could be “contact_form.php”, etc. I started using Markov to build a model; however, instead, I took each set of 2 and initialized those values…e.g. S={R1->R2,R2->R3,R3->RX}. For the next session I might have S={R1->R5,R5->R3,etc.}. At the end of all the parsing, I have a big set of all state transitions possible for each R. So, given RX, there are a finite number of R states that RX can transition to.

4) For each of the R states, I now re-parse the log file and find the number of transitions. This is a matrix that shows the number of observed transitions from RN to every other R state. So, for instance, let’s say that R1 goes to 3 possible states : R4 (27% of time), R11 (3% of time) , and R12 (70% of time). Then the R1 row of our matrix looks like [0, 0, 0, .27, 0, 0, 0, 0, 0, 0, .03, .7]

5) There were some special cases that I had to account for (any page transitioning to the main page, any page transitioning to itself, etc.). Once I accounted for these, I ran my program against the log files and created LOW, MEDIUM, and HIGH alerts. I didn’t use a true standard deviation and I ignored the LOW and MEDIUM stuff…I just wanted the hits where the number for that transition was extremely low or 0. From our example above, this would be a transition like R1->R2=0. I didn’t really expect great results and figured that I would have to do a lot more tweaking…well, this wasn’t the case. I actually got really, really good data on my first run. Example:

732 total state transitions tracked
HIGH RISK GET /componentes3.7/fckeditor/editor/fckeditor.html->GET /affiliate/affiliate53/fckeditor/editor/fckeditor.html

HIGH RISK GET /portfolio/aui/FCKeditor/editor/fckeditor.html->GET /componentes3.7/fckeditor/editor/fckeditor.html

HIGH RISK GET /wp-content/uploads/wpfouot.php->POST /wp-content/plugins/Login-wall-etgFB/login_wall.php


So, I can use really basic machine learning to find my attackers in my web logs. I then parse out the attackers’ IP addresses and can throw them into a firewall ruleset. In the future, I would like to automate this and find when my server is under attack, send a message to my firewall which drops in a route rule which spins all of the attackers traffic to my honey net 🙂

Speaking of honeypots, You can also honeypot certain pages. For instance, I could create bogus files or directories based on what I see attackers going after (like the report from above) and drop canary tokens in there to (see Canary Tools). I can embed honeypot links within HTML comments and see where bots (or humans) are taking links from commented code and trying them out. I can put links in my robots.txt file and see who goes after them…there are so many ways to do this…and, at the end of the day, I can either run these attackers off my network or into a fake network…it’s just TONS and TONS of fun 🙂


    SecuriTeam Secure Disclosure

    SecuriTeam Secure Disclosure (SSD) helps researchers turn their vulnerability discovery skills into a highly paid career. Contact SSD to get the most for your hard work.

Oracle CSO is right

The internet (or at least twitter) is exploding regarding this, now deleted, post : Mary Ann Davidson blog post

Let me start by saying that she is right. Yes, she’s right. Breaking the EULA is against the law. You can’t argue about that.

You can’t argue that they should be paying a bug bounty. You may *want* them to pay a bug bounty, but that is the companies decision. If they choose not to pay a bug bounty, that’s their prerogative.

As a consumer, you can choose to use their product (EULA and all) or not. That is something that you have control over.

As a researcher, you can choose to break the EULA or not. Arguing that someone should modify their EULA so that what you’re doing isn’t a violation is childish.

I wish Oracle had stood by their CSO and left the blog online. I understand that they don’t want additional scrutiny on their product, but the scrutiny will be there irregardless (as it has been for many years now). Leaving the post online would have shown some ‘backbone’. If INFOSEC goes PC, it’s bad for us all. I’d rather someone tell me what they really think and we can go from there.


    SecuriTeam Secure Disclosure

    SecuriTeam Secure Disclosure (SSD) helps researchers turn their vulnerability discovery skills into a highly paid career. Contact SSD to get the most for your hard work.

Play some D!

Hi there. Long-time-no-blog 🙂

If you haven’t already, go read this:

Note: this blog applies to Corporate networks. If you’re a coffee shop or a college, you’re on your own 🙂

I’ve been a network defender for many years. I currently work for a software company that builds network software which helps companies gain insight into how their network is being used and/or abused. I didn’t choose to go into network defense – it chose me. In 1997 at my first “real job” out of college, I was a part of a team that tracked down some hackers that were running around owning a bunch of Solaris servers. From that day, I was hooked.

Network defenders don’t get a lot of credit. If you do your job right, no one ever talks about it. If you do your job wrong, you’ll hear about it every day for the rest or your short-lived career. An attacker can be wrong a million times and only needs to be right once. That’s an advantage. An attacker can spend 2 years in the bowels of one software app. A defender cannot. Accept this fact and move on…we can still win. The attacker has to use your network whilst evading detection. A lot of them don’t spend a lot of time figuring out how to do this right. They don’t have to be stealthy about exfiltrating data because it hasn’t mattered – the defense has been weak. How many recent infections used the darknet as a C&C?…ummm, your network monitoring solution should be SCREAMING AT YOU if someone connects out via Tor or i2p.

The network is like a bodies immune system (though not nearly as complex). The job, if you’re up to it, is to be the immune system. You can’t stop all infections from getting in. In fact, it can be argued that infections must get in to build the immune system. Firewalls and other devices can block things that we have knowledge of; however, something that we haven’t previously encountered will eventually get in (maybe via email, hacked USB drive, 0-day, whatever). Our job is to detect the foreign body, eradicate it, and update the immune system such that that strain of virus can no longer get it. So, how can you do this?

1) know what is “normal” for each host on your network. What ports do they offer? What ports do they connect to? What do their traffic patterns look like for each port? Who do they talk to? Who talks to them? what network protocols do they speak? How long do sessions stay nailed up? If you know this sort of stuff, then an attacker exfiltrating a gig of data cannot be hidden…it’ll stick out like a clown at an IBM business meeting.

2) Method 1 will detect lateral movement, but if you employ dead space within your network, you can flag lateral movement with just a single packet. Use honeynets, host-based IDS, traffic analysis (why is engineering dept trying to talk to HR?), etc. Spray your databases with bogus data that should never be accessed. Put up fake file servers and watch for access or watermark the files and watch them if they move around the network. Be creative…make your network a hostile environment for those who would attack it. The locals know how to get around, the attacker will have to figure out how to move around the network. Make this a painful process for him/her.

3) Look for invalid use of standard ports. Have you ever seen Skype find an “out door” on a network…What about vpn, i2p, p2p, Tor,etc.? Sending outbound traffic over well known ports is very, very common on most networks I have monitored. For each outbound port allowed through your firewall, you should flag on anomalous traffic over that port. What is anomalous? If the port is 80, only valid HTTP should flow over that port. If the port is 443, only TLS/SSL should flow over that port. Find the people tunneling data or sessions out of your network and you have a short list of the folks to keep an eye on.

4) Let the users know that you are watching. If Mabel from Accounting comes in on Monday morning and uploads 2 gig of baby pictures to dropbox, you should go have a chat with her. Get the word out. User education is often overlooked…millions is spent on nifty software but you don’t even have a full time employee working on user education. Sad.

There’s a lot more that I could write, but network defense isn’t a “cookie cutter” operation. Each admin will have to be creative and come up with their own maze for the attackers to run. Good luck out there!


    SecuriTeam Secure Disclosure

    SecuriTeam Secure Disclosure (SSD) helps researchers turn their vulnerability discovery skills into a highly paid career. Contact SSD to get the most for your hard work.

Settle for nothing now … Settle for nothing later!

We settle for this. We, the consumers, are the problem! I don’t have much more to say…a picture is worth a thousand words.

bug bonanza

    SecuriTeam Secure Disclosure

    SecuriTeam Secure Disclosure (SSD) helps researchers turn their vulnerability discovery skills into a highly paid career. Contact SSD to get the most for your hard work.


BananaGlee. I just love saying that word 😉

So, was reading up on the NSA backdoors for Cisco and other OSes,, and got to thinking about how the NSA might exfiltrate their data or run updates…It’s gotta be pretty stealthy, and I’m sure they have means of reflecting data to/from their Remote Operations Center (ROC) in such a way that you can’t merely look at odd destination IPs from your network.

This got me thinking about how I would find such data on a network. First off, obviously, I’d have to tap the firewall between firewall and edge router. I’d also want to tap the firewall for all internal connections. Each of these taps would be duplicated to a separate network card on a passive device.

1) eliminate all traffic that originated from one interface and went out another interface. This has to be an exact match. I would think any changes outside of TTL would be something that would have to be looked at.

2) what is left after (1) would have to be traffic originating from the firewall (although not necessarily using the firewalls IP or MAC). That’s gotta be a much smaller set of data.

3) With the data set from (2), you’ve gotta just start tracing through each one.

This would, no doubt, be tons of fun. I don’t know how often the device phones home to the ROC, what protocol they might use , etc…

If anyone has any ideas, I’d love to hear them. I find this extremely fascinating.

    SecuriTeam Secure Disclosure

    SecuriTeam Secure Disclosure (SSD) helps researchers turn their vulnerability discovery skills into a highly paid career. Contact SSD to get the most for your hard work.

The Users are smarter than we give them credit for

So, my boss had asked me last week to read the Mandiant report and see how these Chinese APT1 attacks could be detected on a network both during and after an attack. After reading the report, I was pretty saddened by just how little has been done in the last 20 years in Infosec. The tactics and protocols used to steal data are old (decades old) and stale. My initial reaction was, and is, that user’s are still not being properly educated AND held responsible for their actions. We’re letting the users off too easily! Corporations are still trying to solve a people problem with software or appliances.

Take a look at the top 15 Security startups of 2013 ( Now, look at how many of these software products ASSUME that the user will do the wrong thing and click on a link or an attachment. We have sandbox technology so that when the user downloads the malware, software can fix it (remember Pelican SafeTNet from late 90’s early 2000’s). We have software that steers employees away from bad websites (how does this work? A list of bad sites won’t work…downloading the page and running static checks won’t work…I dunno…would be interesting to hear more, but I digress).

Look, if your kids were prone to starting fires while cooking food, is the fix to create a million dollar stove that auto-senses when the heat is too high or when the smell of burnt food is in the air and automatically shuts down? Or, is the fix to teach your kids the proper way to use the stove? If I was a Corporate Security officer, I would make user education a top priority. I would even be willing to bring in a company that specialized in user security education (train the trainer type stuff). That would be money well spent. Every new user gets a class in computer security complete with a hands-on lab, test, and an Acceptable Use policy that they sign after completion. Existing users have to “re-certify” every year when they get a performance review.

Next, hold the user accountable for their actions after completing said training. In this day and age, a compromised computer inside the network is a license to steal. Having a computer with Internet access is a serious responsibility. If you mess up and do what you were trained NOT to do, then you are punished. Keep messing up and you get your pink slip. The user’s aren’t as stupid as we make them out to be. If their actions impact their bottom line, they will act accordingly. If we don’t hold the user responsible, why do they have any reason to change their behavior?

And, on a related tangent, maybe I’m just too old school but I don’t understand why a company would allow their employees (paid to do a Corporate-related job) to surf social media, p2p, job-search sites, dating sites, web-based email, etc. etc.



    SecuriTeam Secure Disclosure

    SecuriTeam Secure Disclosure (SSD) helps researchers turn their vulnerability discovery skills into a highly paid career. Contact SSD to get the most for your hard work.

Congrats to UNC Charlotte



I had the chance to hang out at the SECCDC yesterday at Kennesaw State Univ.  For those not familiar with these events (I wasn’t either, until yesterday), you have colleges who bring in teams to defend against a ‘red team’.  UNC Charlotte defended their network better than the other colleges.  It was interesting to see these schools throwing in block filters, redirects, etc. on the fly.  Impressive from a bunch of college students.  The red team was equally impressive.  There wasn’t a box that they didn’t, at some point, root thoroughly…


One interesting note.  During the competition, there was a full power outage.  UPSes died.  Images were lost.  Router configs were killed.  It generally set the entire competition back a few hours (at least).  Just a reminder that physical security is every bit as important as the logical security….




    SecuriTeam Secure Disclosure

    SecuriTeam Secure Disclosure (SSD) helps researchers turn their vulnerability discovery skills into a highly paid career. Contact SSD to get the most for your hard work.