The Datadog Agent is a lightweight piece of software that runs on your hosts. Its job is to faithfully collect events and metrics and bring them to Datadog on your behalf so that you can do something useful with your monitoring and performance data.
The source code for the Datadog Agent can be found here.
The agent is composed of four major components, each written in Python and running as a separate process:
- Collector (agent.py) - The collector runs checks on the current machine for whatever integrations you have, and it captures system metrics like memory and CPU.
- Dogstatsd (dogstatsd.py) - This is a StatsD backend server. It is responsible for aggregating local metrics sent from your code.
- Forwarder (ddagent.py) - The forwarder receives data pushed from both dogstatsd and the collector and queues it up to be sent to Datadog.
- SupervisorD - All of these are controlled by a single supervisor process. The components are kept separate so that you don’t have to have the overhead of each application if you don’t want to run all parts (though we generally recommend you do).
To learn about extending agent checks or writing your own see here.
Note to Windows users: All four agent processes will appear as instances of ddagent.exe with the description "DevOps’ best friend".
In terms of resource consumption the Datadog agent consumes roughly:
- CPU: ~ 0.12% of the CPU used on average
- Memory: ~ 55MB of RAM used
- Network bandwidth: ~ 86 B/s ▼ | 260 B/s ▲
- CPU: ~ 0.35% of the CPU used on average
- Memory: ~ 115MB of RAM used. Note: since v5.15 of the container agent, we recommend setting container resources to at least 256MB due to an added memory cache. Raising the limit is not meant to account for baseline usage but rather to accommodate temporary spikes. Agent 6 has a much more limited memory footprint.
- Network bandwidth: ~ 1900 B/s ▼ | 800 B/s ▲
- Linux: 120MB
- Windows: 60MB
Tests were made on an AWS EC2 c5.xlarge instance (4 vCPU / 8GB RAM). The vanilla datadog-agent was running with a process check to monitor the agent itself.
Enabling more standard/custom integrations of the agent may increase the resource consumption of the agent.
Enabling JMX Checks forces the agent to use more memory depending on the number of beans exposed by the monitored JVMs.
Enabling the trace and the process agents increases resource consumption.
Supervision, Privileges and Network Ports
A Supervisord master process runs as the dd-agent user, and all forked subprocesses run as the same user. This applies to any system call (`iostat`/`netstat`) initiated by the Datadog agent as well. The agent configuration resides at /etc/dd-agent/datadog.conf and /etc/dd-agent/conf.d. All configuration must be readable by dd-agent. The recommended permissions are 0600 since configuration files contain your API key and other credentials needed to access metrics (e.g. mysql, postgresql metrics).
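As a rough illustration of the permission recommendation above, the following sketch checks whether a configuration file has exactly mode 0600. The helper names `has_recommended_mode` and `audit` are hypothetical and are not part of the agent; only the path and the 0600 value come from the text.

```python
import os
import stat

# Mode recommended in the text for agent configuration files.
RECOMMENDED_MODE = 0o600


def has_recommended_mode(path, expected=RECOMMENDED_MODE):
    """Return True if the file's permission bits equal the expected mode."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode == expected


def audit(paths):
    """Return the existing files whose permission bits differ from 0600."""
    return [p for p in paths if os.path.isfile(p) and not has_recommended_mode(p)]


if __name__ == "__main__":
    # Path taken from the text; files that do not exist are skipped.
    for path in audit(["/etc/dd-agent/datadog.conf"]):
        print("unexpected permissions:", path)
```

Running this as the dd-agent user also shows indirectly whether that user can reach the file at all, since `os.stat` fails on a path it cannot traverse.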
The following ports are open for normal operations:
- forwarder tcp/17123 for normal operations and tcp/17124 if graphite support is turned on
- dogstatsd udp/8125
All listening processes are bound by default to 127.0.0.1 and/or ::1 on v 3.4.1 and greater of the agent. In earlier versions, they were bound to 0.0.0.0 (i.e. all interfaces).
Make sure the limit on open file descriptors is high enough (1024 recommended).
You can see this value with the command
If you happen to have a hard limitation below the recommended value (Shell Fork Bomb Protection, etc.), one solution is to add the following in supervisord.conf:
minfds = 100 ; Your hard limit
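The file descriptor limit discussed above can also be inspected from Python with the standard `resource` module (Unix only). This is a minimal sketch; the helper `fd_limit_ok` and the constant `RECOMMENDED_FDS` are hypothetical names, and the value 1024 is the recommendation quoted in the text.

```python
import resource

# Recommended number of open file descriptors, as stated in the text.
RECOMMENDED_FDS = 1024


def fd_limit_ok(soft_limit, recommended=RECOMMENDED_FDS):
    """Return True if the soft limit is unlimited or at least the recommended value."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= recommended


if __name__ == "__main__":
    # getrlimit returns the (soft, hard) limits for the current process.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft limit:", soft, "hard limit:", hard)
    print("meets recommendation:", fd_limit_ok(soft))
```

The limits reported are those of the process running the script, so it is most informative when run as the same user and in the same environment as the agent.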
The collector is where all standard metrics are gathered, every 15 seconds.
The collector also supports the execution of Python-based, user-provided checks, stored in /etc/dd-agent/checks.d. User-provided checks must inherit from the AgentCheck abstract class defined in checks/__init__.py.
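To make the shape of a user-provided check concrete, here is a minimal sketch of a class that inherits from AgentCheck and reports one gauge. The check name `DirectoryCountCheck`, the metric name `example.directory.entries`, and the `path` instance key are all hypothetical. The fallback class exists only so the sketch runs outside the agent; it is a stand-in that records gauges, not the agent's real base class.

```python
import os

try:
    # Inside the agent, the base class is importable from the checks package.
    from checks import AgentCheck
except ImportError:
    # Stand-in for running this sketch outside the agent (an assumption,
    # not the real implementation): it only records submitted gauges.
    class AgentCheck(object):
        def __init__(self, name, init_config, agentConfig, instances=None):
            self.name = name
            self.init_config = init_config
            self.instances = instances or []
            self.recorded = []

        def gauge(self, metric, value, tags=None):
            self.recorded.append((metric, value, tags or []))


class DirectoryCountCheck(AgentCheck):
    """Hypothetical check: report how many entries a directory contains."""

    def check(self, instance):
        # 'path' is a hypothetical key read from the check's instance config.
        path = instance["path"]
        entries = len(os.listdir(path))
        self.gauge("example.directory.entries", entries, tags=["path:" + path])
```

In the agent, a file like this would be placed in /etc/dd-agent/checks.d, and the collector would call `check` once per configured instance on each run.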
The forwarder listens over HTTP for incoming requests, which it buffers and forwards over HTTPS to Datadog HQ. Buffering prevents network splits from affecting metric reporting. Metrics are buffered in memory until a limit in size or number of outstanding requests is reached. After that, the oldest metrics are discarded to keep the forwarder's memory footprint manageable.
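The buffering policy described above (keep payloads in memory, and drop the oldest once a size or count limit is exceeded) can be sketched as follows. This is an illustration of the idea only, not the forwarder's actual code; the class `BoundedBuffer` and its parameters are hypothetical.

```python
from collections import deque


class BoundedBuffer(object):
    """In-memory FIFO that discards the oldest payloads once a limit is exceeded."""

    def __init__(self, max_items, max_bytes):
        self.max_items = max_items
        self.max_bytes = max_bytes
        self._queue = deque()
        self._bytes = 0

    def __len__(self):
        return len(self._queue)

    def add(self, payload):
        """Append a bytes payload, then evict from the oldest end until within limits."""
        self._queue.append(payload)
        self._bytes += len(payload)
        while self._queue and (
            len(self._queue) > self.max_items or self._bytes > self.max_bytes
        ):
            dropped = self._queue.popleft()
            self._bytes -= len(dropped)

    def flush(self, send):
        """Send payloads oldest first; stop at the first failure and keep the rest.

        `send` is any callable returning True on success. Returns the number sent.
        """
        sent = 0
        while self._queue:
            if not send(self._queue[0]):
                break
            self._bytes -= len(self._queue.popleft())
            sent += 1
        return sent
```

During a network split, `flush` would keep returning 0 while `add` continues to accept new payloads, so memory use stays bounded and the most recent data is what survives.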
DogStatsD is a Python implementation of Etsy's StatsD metric aggregation daemon. It is used to receive and roll up arbitrary metrics over UDP, thus allowing custom code to be instrumented without adding latency to the mix.
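As an illustration of why UDP instrumentation adds so little latency, the sketch below sends a single StatsD-style datagram (`name:value|type`, with the optional DogStatsD `|#tag` suffix) to the default local port mentioned earlier, udp/8125. The function names `format_metric` and `send_metric` and the metric `page.views` are hypothetical; in practice an existing client library would normally be used instead.

```python
import socket


def format_metric(name, value, metric_type, tags=None):
    """Build a StatsD datagram such as 'page.views:1|c', with optional tags."""
    payload = "{}:{}|{}".format(name, value, metric_type)
    if tags:
        payload += "|#" + ",".join(tags)
    return payload


def send_metric(name, value, metric_type, tags=None, host="127.0.0.1", port=8125):
    """Send one metric over UDP without waiting for any reply."""
    payload = format_metric(name, value, metric_type, tags).encode("utf-8")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()
    return payload


if __name__ == "__main__":
    # Increment a counter; the datagram is dropped if nothing is listening.
    send_metric("page.views", 1, "c", tags=["env:dev"])
```

Because UDP is connectionless, the caller never blocks on the aggregation daemon, which is the property the paragraph above relies on.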
Learn more about dogstatsd.
To understand the value of using the Datadog agent, reference the following articles: