Agent7 – Security Agent
On our 500 host network, tell me how many of our privileged users that are stale BUT active have scheduled or startup tasks running from a directory that is NOT C:\Program Files\*, within the last week. Oh and get that to me by end of day. Thanks!
Let’s pretend you are given the task above. How would you do it? It’s nearly impossible, or at least VERY difficult. Well, Agent7 could give you this data in a single query! If you’re interested, keep reading! (If you actually get this request, fix your resume and quit your job. Just get out of there.)
I’ll go into more detail below, but you can view the project on GitHub. It’s free for anyone to download and use (and submit PRs!). I finally got some time this past weekend to containerize most of the monolith, and now you can install the server and agent within a couple of minutes (thanks to Docker).
So what is it?
Agent7 is a monitoring, detection & response security agent/server for Windows. It supports Windows 7/Server 2008 and above. At its core, it collects a bunch of information from the endpoint and sends it to the server, where the data is enriched and dashboards/alerts are generated automatically. I developed this agent/server because I was interested in learning the Windows API, and eventually it grew into a security agent.
What data does it collect?
Agent7 collects the following data from the endpoint. As you can see, it can also collect data from Active Directory.
- Scheduled tasks
- Logged-on users
- Network connections
- Network shares
- Local users
- Local groups
- System metrics
- Startup tasks
- Memory and disk space
- Network pipes
- Network sessions
- Registry keys
- Domain users
- Domain groups
- Domain computers
- Domains & domain controllers
- SysVol and permissions
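To give a feel for what one of these collectors looks like, here is a minimal, hypothetical sketch that parses the CSV output of Windows' built-in `schtasks /query /fo CSV` command into normalized dicts. The real agent talks to the Windows API directly; the helper names here (`parse_schtasks_csv`, `collect_scheduled_tasks`) are mine, purely for illustration.

```python
import csv
import io
import subprocess

def parse_schtasks_csv(raw: str) -> list[dict]:
    """Parse the CSV emitted by `schtasks /query /fo CSV` into a list of dicts."""
    reader = csv.DictReader(io.StringIO(raw))
    # schtasks repeats the header row for each task folder; drop those repeats.
    return [row for row in reader if row.get("TaskName") != "TaskName"]

def collect_scheduled_tasks() -> list[dict]:
    """Shell out to schtasks (Windows only) and normalize its output."""
    raw = subprocess.run(
        ["schtasks", "/query", "/fo", "CSV", "/v"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_schtasks_csv(raw)
```

Each collected record would then be tagged with the agent ID and shipped to the server as JSON.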
Does it only collect data?
Well, no… sure, it collects and normalizes the data points listed above, but it also has a Remote Interactive (shell) capability. If you have ever used CrowdStrike RTR, this is the same kind of feature. You can remote into a host (or many hosts!) and run commands. This enables a ton of capabilities, but it can also be dangerous… which is why you can lock down which commands a specific user is allowed to use. It’s still fun to run a query from the console and ask all agents whether a specific registry key is set (or not).
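The per-user command lockdown boils down to an allowlist check before a Remote Interactive command is dispatched. Here is a hypothetical sketch of that idea; the `ALLOWED_COMMANDS` table and `is_command_allowed` helper are mine, not Agent7’s actual implementation:

```python
# Hypothetical per-user allowlists; a real deployment would load these from config.
ALLOWED_COMMANDS = {
    "analyst":   {"reg", "tasklist", "netstat"},
    "responder": {"reg", "tasklist", "netstat", "taskkill", "del"},
}

def is_command_allowed(user: str, command_line: str) -> bool:
    """Allow a command only if its executable is on the user's allowlist."""
    parts = command_line.split()
    if not parts:
        return False  # empty command line
    return parts[0].lower() in ALLOWED_COMMANDS.get(user, set())
```

With this, an analyst could run `reg query` across the fleet but would be refused `taskkill`, while a responder gets both.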
Why is Agent7 useful?
In the open-source world, it’s actually tough to find a decent monitoring, detection and response agent that also has a decent server and a powerful API. OSSEC, GRR, osquery and Sysmon come to mind when you think of monitoring, and those tools are great, but they take hundreds of hours before you get any meaningful and actionable data (GRR is actually good here). Don’t believe me? Go ahead and install Sysmon. Then figure out how to collect the logs. Then ship them and store them. Then query the data in a console. Then develop IOCs or similar. It’s super tough, and Agent7 alleviates all of that. Of course, those tools also excel in certain areas where Agent7 does not. But most importantly, Agent7 provides a very powerful API so that you can create your own analytics.
How do I get started?
It’s pretty easy. Just grab the `docker-compose.yml` file from the GitHub repo and make sure Docker is installed. Please see the README on GitHub for more information.
Let’s look at the architecture and how this thing works
So let’s dive into how the Agent7 agent/server works. There are also many things that I would like to change, and I’ll note them as we go.
So this is basically how the architecture looks. There are 5 containers running on the Docker internal network: 3 custom containers (agent7_ui, agent7_controller and agent7_poller), plus PostgreSQL and RabbitMQ. The agent is an .exe that is executed on the endpoint and connects back to the Agent7 server. The only container that is exposed or routable to the outside world is agent7_ui (basically the web console).
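Roughly, the `docker-compose.yml` wires those five services together. This is only a simplified sketch of the shape, not the repo’s actual file; the service names come from the architecture above, but the images, ports and dependencies here are placeholders:

```yaml
services:
  agent7_ui:          # Flask console; the only externally exposed service
    ports:
      - "443:443"     # placeholder port
    depends_on: [rabbitmq, postgres]
  agent7_controller:  # consumes the RabbitMQ queue, dedupes, writes to Postgres
    depends_on: [rabbitmq, postgres]
  agent7_poller:      # background jobs (e.g. GeoIP enrichment)
    depends_on: [postgres]
  rabbitmq:
    image: rabbitmq:3
  postgres:
    image: postgres:15
```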
Agent7 endpoint agent
You run .\agent7_installer.exe on the endpoint, and it registers with the Agent7 server. There is a shared Site Key that the agent must have to register with the server. Upon successful registration, the server generates an agent-specific token and passes it back. The agent saves this token in the registry. This token basically gives the agent permission to POST data to certain endpoints only. This is one of the areas that must be improved. Currently, every agent stores a token locally and the server stores ALL tokens. That’s basically how authentication happens: if the tokens match, the agent is allowed to POST data.
This isn’t ideal because every time the agent checks into the server (to POST data or get a command), the server must perform a database lookup to make sure the token is valid. The plan is to use JWTs instead, which alleviates this and allows you to break apart the control/data plane. In other words, other containers or an API gateway could authenticate an agent because they can verify (via the public key) that the JWT is valid. The agent sends data over a TLS-encrypted channel, but there is a possibility in the future to also encrypt the actual payload as well… we will see.
But anyway, the agent checks into the server every 20–60 seconds. It gets a job from the server, such as “Collect logged-on users and startup tasks”, performs the action and sends back the data. So this is all based on a poll interval. Obviously, if you log in and out very quickly, the agent may NOT pick that data up… but from my research, your Windows session does not disappear immediately when you log off. You can set a poll interval for specific actions. Installed software doesn’t change that much, so maybe tell the agent to send that data once a day, while network connections and logged-on users change much more frequently, so you can collect that data every 30 seconds.
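The per-action poll interval amounts to tracking, for each collector, when it last ran. A small illustrative sketch (the collector names, intervals and `due_collectors` helper are made up for this example):

```python
# Hypothetical per-collector poll intervals, in seconds.
INTERVALS = {
    "installed_software": 86400,  # rarely changes: once a day
    "network_connections": 30,    # churns constantly: every 30 seconds
    "logged_on_users": 30,
}

def due_collectors(last_run: dict[str, float], now: float) -> list[str]:
    """Return the collectors whose interval has elapsed since their last run."""
    return [name for name, interval in INTERVALS.items()
            if now - last_run.get(name, 0.0) >= interval]
```

On each check-in, the agent would run whatever `due_collectors` returns, POST the results, and record the run time.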
agent7_ui
This container is basically the “fat” server component and also what you log into to view all the data that the agents send. It’s basically just a Flask application with jQuery. The agent sends data directly to this container. When the agent7_ui server gets the data, it authenticates the token and then immediately places the data in the RabbitMQ queue. This flow will change a bit when JWTs are used instead.
agent7_controller
This container just watches the RabbitMQ queue. When it finds new data in the queue (i.e. from agents posting data), it parses the data, enriches it, and creates a hash of a few fields in the data. This hash is called the message_id. The controller then queries its local SQLite3 database and asks, “Have I seen this data before?”. We have to do this analysis because the agent is based on a poll interval and may send the same data more than once. If it is in fact new data, the controller inserts it into the Postgres database and updates the local ledger.
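The dedup step can be sketched like this: hash a few stable fields into a message_id, then consult a local SQLite ledger before inserting into Postgres. The field choice and table name here are illustrative, not Agent7’s actual schema:

```python
import hashlib
import json
import sqlite3

def message_id(record: dict, fields: tuple = ("agent_id", "type", "data")) -> str:
    """Hash a few stable fields so the same observation always maps to one id."""
    material = json.dumps({f: record.get(f) for f in fields}, sort_keys=True)
    return hashlib.sha256(material.encode()).hexdigest()

def seen_before(ledger: sqlite3.Connection, msg_id: str) -> bool:
    """Check the local ledger; record the id if it is new."""
    ledger.execute("CREATE TABLE IF NOT EXISTS ledger (message_id TEXT PRIMARY KEY)")
    if ledger.execute("SELECT 1 FROM ledger WHERE message_id = ?", (msg_id,)).fetchone():
        return True  # duplicate from a repeated poll; skip the Postgres insert
    ledger.execute("INSERT INTO ledger VALUES (?)", (msg_id,))
    return False
```

Sorting the JSON keys keeps the hash deterministic regardless of the order fields arrive in.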
agent7_poller
This container replaced what Redis used to do. I was using Redis and RQ/RQ-Scheduler to run background tasks when Agent7 was a big monolith, but obviously that doesn’t really work with single-purpose containers. agent7_poller lets you write small Python methods to perform “background” jobs. For example, one of the jobs queries all network connections in the Postgres database and enriches them with GeoIP data. This is probably the least mature area of Agent7… mostly because I ripped out an extremely stable and battle-tested Redis for a wonky poller. But it actually works well. The big improvement here would be adding a new queue in RabbitMQ; I’ll explain this in the section below.
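The poller’s job model can be pictured as a registry of small Python callables run on a schedule. A hypothetical sketch (the decorator, registry and fake GeoIP data are mine, purely to show the shape):

```python
JOBS = {}

def background_job(name: str):
    """Register a small method as a named background job."""
    def decorator(func):
        JOBS[name] = func
        return func
    return decorator

@background_job("geoip_enrich")
def geoip_enrich(connections: list[dict]) -> list[dict]:
    """Toy enrichment: tag each connection with a (fake) country lookup."""
    fake_geoip = {"203.0.113.7": "AU", "198.51.100.2": "US"}  # placeholder data
    for conn in connections:
        conn["country"] = fake_geoip.get(conn["remote_ip"], "unknown")
    return connections
```

The real job would pull connections from Postgres and use an actual GeoIP database, but the register-and-run pattern is the same.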
RabbitMQ
RabbitMQ is just a queue. It’s a fantastic product, and I chose it over Kafka because it’s dead simple and doesn’t have as many dependencies (such as ZooKeeper). If you are not familiar with queues, they just let you place data in the queue, and something (called a consumer) takes the data and does whatever with it. It’s a great way to decouple your applications and also “fan out” data to multiple destinations. The big improvement here is that Agent7 only uses 1 queue. I would like to use the first queue as a “raw” queue where unparsed, unenriched data is placed (as it is now). There would be another queue that holds the parsed and enriched data. agent7_poller could then perform its analysis before sending the data on to Postgres. But anyway, RabbitMQ works great and I highly, highly recommend the product.
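The planned raw → enriched two-queue flow can be illustrated with the standard library’s `queue` module standing in for RabbitMQ. In production this would be RabbitMQ via a client library like pika; the names and the `enrich` step here are illustrative:

```python
import queue

raw_queue = queue.Queue()       # agents' unparsed posts land here (as today)
enriched_queue = queue.Queue()  # planned second queue for parsed/enriched data

def enrich(record: dict) -> dict:
    """Stand-in for the parse/enrich work done between the two queues."""
    record = dict(record)
    record["enriched"] = True
    return record

def controller_step() -> None:
    """Consume one raw message, enrich it, publish it to the enriched queue."""
    record = raw_queue.get_nowait()
    enriched_queue.put(enrich(record))
```

With two queues, downstream consumers (the poller, Postgres writers, future analytics) only ever see clean, enriched records.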
So that was a decent overview of the containers/services in the architecture. I’ll touch on Remote Interactive and the API before closing this post out.
Below is an image of the Remote Interactive module, which allows users to remote/shell into a host and run commands. Great for incident response. You can also easily run commands across your entire fleet.
And lastly, my favorite part of Agent7… the API. It allows you to develop some cool queries and answer nearly any question about your fleet. Remember that ridiculous request at the beginning of this post? It can be easily solved with the query below:
/api/agent/users?as_json=True;filter=last_logon,gt,2 months ago;is_priv,eq,1;active,eq,true;models=tasks,schtasks;cmd_line,notlike,C:\Program Files\*
It’s a complicated request… but it’s basically just chaining a bunch of filters together and translating that query into a SQL query. I open-sourced the library that performs the “URI -> SQL” translation as well.
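The core of the “URI -> SQL” idea is splitting each `;`-separated filter into `field,operator,value` triples and mapping them onto a parameterized WHERE clause. This simplified sketch is not the open-sourced library itself, just the concept; the operator map is abridged, and a real implementation must also validate field names (they are interpolated into SQL) and handle options like `as_json` and `models`:

```python
OPERATORS = {"eq": "=", "gt": ">", "lt": "<", "like": "LIKE", "notlike": "NOT LIKE"}

def filters_to_sql(filter_string: str) -> tuple[str, list]:
    """Turn 'field,op,value;field,op,value' into a WHERE clause plus bind params."""
    clauses, params = [], []
    for triple in filter_string.split(";"):
        field, op, value = triple.split(",", 2)  # value may itself contain commas
        clauses.append(f"{field} {OPERATORS[op]} ?")
        params.append(value)
    return " AND ".join(clauses), params
```

For example, `is_priv,eq,1;active,eq,true` becomes `is_priv = ? AND active = ?` with params `['1', 'true']`, ready to append to the base SELECT for the `/api/agent/users` endpoint.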
But let’s look at another example that is pretty simple but very helpful: “Tell me all software that has been installed in the last 5 days” (you can even filter by specific publishers that are not approved).
I think that’s all for now! If you’re interested, please give the project a try and even submit PRs! They are definitely welcome.