What is the Datadog Agent? What resources does it consume?

Introduction

The Datadog Agent is a lightweight piece of software that runs on your hosts. Its job is to faithfully collect events and metrics and bring them to Datadog on your behalf so that you can do something useful with your monitoring and performance data.

The source code for the Datadog Agent can be found here.


Agent Architecture

The Agent is composed of four major components, each written in Python and running as a separate process:

  • Collector (agent.py) - Runs checks on the current machine for whatever integrations you have, and captures system metrics such as memory and CPU.
  • Dogstatsd (dogstatsd.py) - A StatsD backend server, responsible for aggregating local metrics sent from your code.
  • Forwarder (ddagent.py) - Receives data pushed by both dogstatsd and the collector, and queues it up to be sent to Datadog.
  • SupervisorD - A single supervisor process that controls the other three. We keep the components separate so you don’t have to incur the overhead of each one if you don’t want to run all parts (though we generally recommend you do).

To learn about extending agent checks or writing your own see here.

Note to Windows users: All four agent processes will appear as instances of ddagent.exe with the description "DevOps’ best friend".

Agent Overhead

The Datadog Agent consumes roughly the following resources:

  • Resident memory (actual RAM used): 50MB
  • CPU Runtime: less than 1% of averaged runtime
  • Disk:
    • Linux: 120MB
    • Windows: 60MB
  • Network: 10-50 KB of bandwidth per minute

The stats listed above are based on an EC2 m1.large instance running for 10+ days.

Supervision, Privileges and Network Ports

A Supervisord master process runs as the dd-agent user, and all forked subprocesses run as the same user. This also applies to any system command (`iostat`/`netstat`) that the Datadog Agent invokes. The Agent configuration resides at /etc/dd-agent/datadog.conf and /etc/dd-agent/conf.d. All configuration must be readable by dd-agent. The recommended permissions are 0600, since configuration files contain your API key and other credentials needed to access metrics (e.g. MySQL or PostgreSQL metrics).
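
If you want to verify the 0600 recommendation on a host, a check along these lines may help. It is a minimal sketch, not part of the Agent: the is_private helper is a name invented for this example, and it only tests that no group or other permission bits are set on the file.

```python
import os
import stat


def is_private(path):
    """Return True if only the file's owner has any access to it.

    Mode 0600 passes; any group or other permission bit fails.
    (Hypothetical helper for illustration, not an Agent function.)
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


if __name__ == "__main__":
    # Path taken from the text above; adjust it if your install differs.
    conf = "/etc/dd-agent/datadog.conf"
    if os.path.exists(conf):
        print("%s private: %s" % (conf, is_private(conf)))
```

Run it as a user that can read /etc/dd-agent (for example dd-agent or root), otherwise the file may not be visible at all.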

The following ports are open for normal operations:

  • forwarder tcp/17123 for normal operations and tcp/17124 if graphite support is turned on
  • dogstatsd udp/8125

All listening processes are bound by default to 127.0.0.1 and/or ::1 in version 3.4.1 and later of the Agent. In earlier versions, they were bound to 0.0.0.0 (i.e. all interfaces).
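
To confirm that the forwarder is accepting connections on the loopback interface, you can attempt a TCP connection to the port listed above. The snippet below is a rough sketch under that assumption; is_listening is a helper name made up for this example, and it says nothing about whether the process behind the port is actually the Agent.

```python
import socket


def is_listening(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds.

    (Hypothetical helper for illustration, not an Agent function.)
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # 17123 is the forwarder port listed above. dogstatsd uses UDP, which
    # cannot be probed with a TCP connect, so it is not checked here.
    print("forwarder listening: %s" % is_listening(17123))
```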

For information on running the Agent through a proxy please see here; for which ranges to allow, see here.

Make sure the limit on open file descriptors is high enough (1024 recommended).

You can see this value with the command ulimit -a
If you have a hard limit below the recommended value (shell fork bomb protection, etc.), one solution is to add the following to supervisord.conf:

[supervisord]
; Set this to your hard limit
minfds = 100
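
The file descriptor limit can also be read from Python with the standard resource module (Unix only). This is a small sketch for checking a host against the 1024 recommendation above; fd_limit_ok and RECOMMENDED_FDS are names chosen for this example. Note that it reports the limit of the process running the script, which may differ from the limit the Agent itself runs under.

```python
import resource

# Value recommended in the text above.
RECOMMENDED_FDS = 1024


def fd_limit_ok(soft_limit, recommended=RECOMMENDED_FDS):
    """Return True if the soft limit is unlimited or at least `recommended`.

    (Hypothetical helper for illustration, not an Agent function.)
    """
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= recommended


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft=%s hard=%s ok=%s" % (soft, hard, fd_limit_ok(soft)))
```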

 

The Collector

This is where all standard metrics are gathered, every 15 seconds.

The collector also supports the execution of Python-based, user-provided checks, stored in /etc/dd-agent/checks.d. User-provided checks must inherit from the AgentCheck abstract class defined in checks/__init__.py.
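
As a rough illustration of what a user-provided check looks like, here is a minimal sketch. The QueueDepthCheck class, the example.queue.depth metric name, and the "items" and "tags" instance keys are all invented for this example. The fallback AgentCheck class is only a stand-in so the snippet can run outside the Agent; it is not the real base class, which provides far more than this.

```python
try:
    # Inside the Agent, the real base class is importable from `checks`.
    from checks import AgentCheck
except ImportError:
    class AgentCheck(object):
        """Stand-in used ONLY when the Agent's code is not importable."""

        def __init__(self, name, init_config, agentConfig, instances=None):
            self.name = name
            self.instances = instances or []
            self.submitted = []  # records what the check reported

        def gauge(self, metric, value, tags=None):
            self.submitted.append((metric, value, tags))


class QueueDepthCheck(AgentCheck):
    """Hypothetical check: reports the length of a list as a gauge."""

    def check(self, instance):
        # `instance` is one entry from the check's configuration.
        depth = len(instance.get("items", []))
        self.gauge("example.queue.depth", depth, tags=instance.get("tags"))
```

The collector calls check() once per configured instance on each run, so the method should do its work quickly and report through the metric methods rather than printing or sleeping.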

The Forwarder

The forwarder listens over HTTP for incoming requests to buffer and forward over HTTPS to Datadog HQ. Buffering allows for network splits to not affect metric reporting. Metrics will be buffered in memory until a limit in size or number of outstanding requests to send is reached. Afterwards the oldest metrics will be discarded to keep the forwarder's memory footprint manageable.
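
The discard-oldest behavior described above can be pictured with a small bounded buffer. This is an illustrative sketch only, not the forwarder's actual code: the BoundedBuffer class and its max_items and max_bytes parameters are assumptions made for the example.

```python
from collections import deque


class BoundedBuffer(object):
    """Keeps the newest payloads; evicts the oldest once a limit is exceeded.

    (Hypothetical class for illustration, not the forwarder's implementation.)
    """

    def __init__(self, max_items, max_bytes):
        self.max_items = max_items
        self.max_bytes = max_bytes
        self._queue = deque()
        self._bytes = 0

    def add(self, payload):
        """Append a bytes payload, then trim from the oldest end if needed."""
        self._queue.append(payload)
        self._bytes += len(payload)
        while self._queue and (
            len(self._queue) > self.max_items or self._bytes > self.max_bytes
        ):
            dropped = self._queue.popleft()
            self._bytes -= len(dropped)

    def drain(self):
        """Return buffered payloads oldest-first and empty the buffer."""
        items = list(self._queue)
        self._queue.clear()
        self._bytes = 0
        return items
```

During a network split, payloads keep accumulating through add(); once connectivity returns, drain() hands back whatever survived, and anything older than the limits allowed has already been dropped.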

DogStatsD

DogStatsD is a Python implementation of Etsy's StatsD metric aggregation daemon. It receives and rolls up arbitrary metrics over UDP, which allows custom code to be instrumented without adding latency.
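
To show why UDP keeps instrumentation cheap, here is a minimal sketch that sends a metric to the dogstatsd port using only the standard library. The format_metric and send_metric helpers and the example.page.views metric name are made up for this example; in practice you would normally use a DogStatsD client library rather than hand-built datagrams.

```python
import socket


def format_metric(name, value, metric_type, tags=None):
    """Build a StatsD datagram such as b"page.views:1|c|#env:dev".

    metric_type is the StatsD type code, e.g. "c" (counter) or "g" (gauge).
    """
    payload = "%s:%s|%s" % (name, value, metric_type)
    if tags:
        payload += "|#" + ",".join(tags)
    return payload.encode("utf-8")


def send_metric(name, value, metric_type, tags=None,
                host="127.0.0.1", port=8125):
    """Fire-and-forget: UDP does not wait for, or require, a reply."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_metric(name, value, metric_type, tags), (host, port))
    finally:
        sock.close()


if __name__ == "__main__":
    send_metric("example.page.views", 1, "c", tags=["env:dev"])
```

Because the sender never blocks on a response, the call costs about the same whether or not dogstatsd is running; the trade-off is that a datagram sent while the daemon is down is silently lost.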

Learn more about dogstatsd.

Agent Benefits

To understand the value of using the Datadog agent, reference the following articles:

 

 
