Conversation

@kohyoungheon
Contributor

@kohyoungheon kohyoungheon commented Dec 10, 2025

What does this PR do? What is the motivation?

https://datadoghq.atlassian.net/browse/TEEP-3046

  • Create new Container Log Collection Troubleshooting Doc
  • Create menu item for new doc
  • Update existing Docker/Kubernetes log collection docs to link to this new doc in further reading sections and troubleshooting sections
  • Remove old docker log collection troubleshooting guide

Merge instructions

Merge readiness:

  • Ready for merge

For Datadog employees:

Your branch name MUST follow the <name>/<description> convention and include the forward slash (/). Without this format, your pull request will not pass CI, the GitLab pipeline will not run, and you won't get a branch preview. Getting a branch preview makes it easier for us to check any issues with your PR, such as broken links.

If your branch doesn't follow this format, rename it or create a new branch and PR.

[6/5/2025] Merge queue has been disabled on the documentation repo. If you have write access to the repo, the PR has been reviewed by a Documentation team member, and all of the required checks have passed, you can use the Squash and Merge button to merge the PR. If you don't have write access, or you need help, reach out in the #documentation channel in Slack.

Additional notes

@kohyoungheon kohyoungheon requested a review from a team as a code owner December 10, 2025 19:27
@github-actions github-actions bot added the Architecture (Everything related to the Doc backend), Images (Images are added/removed with this PR), and Guide (Content impacting a guide) labels Dec 10, 2025
@buraizu buraizu self-assigned this Dec 10, 2025
@buraizu
Contributor

buraizu commented Dec 10, 2025

Created DOCS-12883 for documentation team review

@buraizu buraizu added the editorial review (Waiting on a more in-depth review) label Dec 10, 2025
@buraizu buraizu removed their assignment Dec 10, 2025
@JacksonDavenport JacksonDavenport added the tap (Issues created from the Tickets Analysis Platform (TAP) dashboard) label Dec 11, 2025
@JacksonDavenport
Contributor

Happy with the changes from the Containers TEE side 👍, but can you also adjust this page:

This links to the Kubernetes log doc and its now-removed troubleshooting section.

@estherk15 estherk15 self-assigned this Dec 15, 2025
Contributor

@estherk15 estherk15 left a comment

Thanks so much for taking this on @kohyoungheon. Most of my feedback is to maintain docs style and clarify content. I also think the first section should be split off as a guide. It's high level and walks a reader through the why's of log collection and troubleshooting, rather than the how. Let me know if you have any questions on my comments!


#### Container collect all configuration

Consult the [Docker][1] and [Kubernetes][2] log collection docs for the full steps on how to enable log collection. For quick reference you can see samples on how to configure the Agent to enable log collection and enable the `container_collect_all` feature, which defaults to false.

Suggested change
- Consult the [Docker][1] and [Kubernetes][2] log collection docs for the full steps on how to enable log collection. For quick reference you can see samples on how to configure the Agent to enable log collection and enable the `container_collect_all` feature, which defaults to false.
+ For comprehensive instructions on how to enable log collection, see the [Docker][1] and [Kubernetes][2] log collection documentation. For quick reference you can see samples on how to configure the Agent to enable log collection and enable the `container_collect_all` feature, which defaults to false.
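
For readers of this thread who want a concrete picture of the setting under discussion, here is a minimal sketch of the two environment variables that turn on log collection and the `container_collect_all` feature. The helper name `log_collection_env` is hypothetical and exists only for illustration; the variable names `DD_LOGS_ENABLED` and `DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL` are the Agent settings, and the default of `false` mirrors the default stated in the doc text above.

```python
def log_collection_env(collect_all: bool = False) -> dict[str, str]:
    """Build the environment variables for Agent log collection.

    Hypothetical helper for illustration only. `collect_all` defaults to
    False to mirror the documented default of `container_collect_all`.
    """
    return {
        # Enables the Agent's log collection feature.
        "DD_LOGS_ENABLED": "true",
        # Collects logs from every discovered container when "true".
        "DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL": str(collect_all).lower(),
    }


if __name__ == "__main__":
    # Print the variables in KEY=value form, for example for an env file.
    for name, value in log_collection_env(collect_all=True).items():
        print(f"{name}={value}")
```

With `collect_all` left at its default, containers would need their own Autodiscovery log configuration before the Agent tails them.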


#### Autodiscovery configuration

You can configure which containers the Agent collects logs from by using Autodiscovery configurations. Datadog recommends to configure this by using [container labels in Docker][6] or [Pod annotations in Kubernetes][7]. These are JSON based log configurations placed on the corresponding container/pod emitting those logs. You can see a minimal example below:

Suggested change
- You can configure which containers the Agent collects logs from by using Autodiscovery configurations. Datadog recommends to configure this by using [container labels in Docker][6] or [Pod annotations in Kubernetes][7]. These are JSON based log configurations placed on the corresponding container/pod emitting those logs. You can see a minimal example below:
+ Autodiscovery allows you to configure which containers the Agent collects logs from. Datadog recommends using [container labels in Docker][6] or [Pod annotations in Kubernetes][7]. These are JSON based log configurations placed on the corresponding container/pod emitting those logs. See the following minimal example:
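
The minimal example the doc refers to is not quoted in this thread. As a stand-in, the sketch below shows the general shape of a JSON-based log configuration and the keys it is attached to: the `com.datadoghq.ad.logs` container label in Docker and the `ad.datadoghq.com/<CONTAINER_NAME>.logs` Pod annotation in Kubernetes. The function names and the `nginx`/`web` values are assumptions chosen for illustration, not content from the PR.

```python
import json


def autodiscovery_log_config(source: str, service: str) -> str:
    """Serialize a minimal log configuration as a JSON list of one object."""
    return json.dumps([{"source": source, "service": service}])


def docker_label(source: str, service: str) -> dict[str, str]:
    """Return the Docker container label carrying the log configuration."""
    return {"com.datadoghq.ad.logs": autodiscovery_log_config(source, service)}


def kubernetes_annotation(container_name: str, source: str, service: str) -> dict[str, str]:
    """Return the Pod annotation for one container in the Pod.

    `container_name` must match the container's name in the Pod spec.
    """
    key = f"ad.datadoghq.com/{container_name}.logs"
    return {key: autodiscovery_log_config(source, service)}


if __name__ == "__main__":
    # Hypothetical container and service names, for illustration only.
    print(docker_label("nginx", "web"))
    print(kubernetes_annotation("web-server", "nginx", "web"))
```

Both forms carry the same JSON payload; only the key that attaches it to the container differs between the two environments.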


### Tagging

The Agent automatically assigns tags to your logs at the “high” level of [tag cardinality][10] for each environment. You can view the out-of-the-box [Docker tags here][11] and [Kubernetes tags here][12]. This additionally includes and tags collected by [Unified Service Tagging][13] or different tag extraction rules from container metadata.

Suggested change
- The Agent automatically assigns tags to your logs at the “high” level of [tag cardinality][10] for each environment. You can view the out-of-the-box [Docker tags here][11] and [Kubernetes tags here][12]. This additionally includes and tags collected by [Unified Service Tagging][13] or different tag extraction rules from container metadata.
+ The Agent automatically assigns tags to your logs at the “high” level of [tag cardinality][10] for each environment. You can view the out-of-the-box [Docker tags here][11] and [Kubernetes tags here][12]. This also includes any tags collected by [Unified Service Tagging][13] or different tag extraction rules from container metadata.

Comment on lines 170 to 172
To customize these tags, change the log collection rules, or enable log collection in general - you can apply Autodiscovery Labels or Annotations to the respective containers as documented above.

Tags on your logs can also come from [host tag inheritance][14]. All data, including logs, coming into Datadog goes through this process. On Datadog intake the logs will inherit all the host-level tags that are associated with that host. You can see these tags on the Infrastructure List for you host. These are most commonly set by:

Suggested change
- To customize these tags, change the log collection rules, or enable log collection in general - you can apply Autodiscovery Labels or Annotations to the respective containers as documented above.
- Tags on your logs can also come from [host tag inheritance][14]. All data, including logs, coming into Datadog goes through this process. On Datadog intake the logs will inherit all the host-level tags that are associated with that host. You can see these tags on the Infrastructure List for you host. These are most commonly set by:
+ To customize these tags, change the log collection rules, or enable log collection in general - you can apply Autodiscovery Labels or Annotations to the respective containers.
+ Tags on your logs can also come from [host tag inheritance][14]. All data, including logs, coming into Datadog goes through this process. On Datadog intake the logs inherit all the host-level tags that are associated with that host. You can see these tags on the Infrastructure List for you host. These are most commonly set by:

- The Datadog Agent and its automatic discovery or manual set of `DD_TAGS` provided
- The cloud provider integrations collecting and setting tags for your hosts

So for example the tags `pod_name` and `short_image` come from the Agent setting this tag on submission. Other tags like `region` and `kube_cluster_name` come from host tag inheritance on intake.

Suggested change
- So for example the tags `pod_name` and `short_image` come from the Agent setting this tag on submission. Other tags like `region` and `kube_cluster_name` come from host tag inheritance on intake.
+ For example, the tags `pod_name` and `short_image` come from the Agent setting this tag on submission. Other tags like `region` and `kube_cluster_name` come from host tag inheritance on intake.
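
The two tag origins described in this passage can be pictured with a toy model. This is only a sketch of the idea that a log ends up with Agent-set tags plus inherited host tags; it is not how Datadog intake is implemented. The function name and all tag values are hypothetical.

```python
def tags_after_intake(agent_tags: list[str], host_tags: list[str]) -> list[str]:
    """Toy model: combine tags set by the Agent with inherited host tags.

    Order is preserved and exact duplicates are dropped. This illustrates
    the concept only and makes no claim about the real intake pipeline.
    """
    return list(dict.fromkeys([*agent_tags, *host_tags]))


if __name__ == "__main__":
    # Set by the Agent on submission (hypothetical values).
    agent_tags = ["pod_name:web-0", "short_image:nginx"]
    # Inherited from the host on intake (hypothetical values).
    host_tags = ["region:us-east-1", "kube_cluster_name:prod"]
    for tag in tags_after_intake(agent_tags, host_tags):
        print(tag)
```

When a tag is missing from a log, this split suggests where to look first: the Agent's configuration for the first group, and the host's tags on the Infrastructure List for the second.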

@kohyoungheon
Contributor Author

Hi @estherk15 Thank you so much for reviewing this PR! I spoke to the containers TEE team about these particular suggestions:

  1. The first section should be split off as a guide. It's high level and walks a reader through the why's of log collection and troubleshooting, rather than the how.

We would like this section to stay in the troubleshooting doc where it currently is. It's meant to summarize the Kubernetes, Docker, tagging, and Autodiscovery steps into their most basic parts and then, importantly, build upon that in the following troubleshooting sections.

For example:

  • Providing info on where the log files are maintained, then later showing how to export them
  • Telling you the Autodiscovery rules, then showing how the status and configcheck commands can show that your logs are being tailed and what config is applied
  • Showing you the tagging rules, then showing how stream-logs can expose these tags and how hostname preprocessing can be affected

I feel that separating them into their own page would make the troubleshooting process less clear, as you would have to switch between those pages.

  2. As for that other optional change of moving from tabs to a table or description list, we would like to keep these as tabs for consistency between the chunky Autodiscovery configuration sections.

Thank you. Please let me know if you would like me to make any other changes!

@estherk15
Contributor

@kohyoungheon Thanks for checking with the TEE team and sharing their reasoning. I agree that the context connects well to the later troubleshooting steps. My only concern is that when users land on a troubleshooting page, they’re usually trying to solve something right away, and a long introductory section can slow them down.

I still think the context is important, just possibly better positioned as prerequisite or collapsible content so it supports, rather than interrupts, troubleshooting. But I’m happy to work with whichever direction the team prefers.

@JacksonDavenport
Contributor

Hey @estherk15 Young and I chatted some more about this. I think we'd prefer to keep the structure as is. If users are really in a rush they can use the menu on the right to skip right to the Troubleshooting commands and Common Issues sections. But ideally they do take the time to get the quick context on how everything works and take stock relative to the questions in those first sections. In that sense I'd agree that this is really a prerequisite step for your troubleshooting more than anything.

The alternative of putting this in a different doc feels like it would be cumbersome to navigate between the two for such a related topic. Having it as collapsible content in each of the troubleshooting commands sections may make those sections too wordy. Some of this context overlaps as well between the status, configcheck, and stream-logs output which might make that difficult.

We could change the parent headers like the following (or to whatever you'd prefer) if you think that might flow better than Log collection context.

## Overview
## Troubleshooting prerequisites
### Log files
### ...

## Troubleshooting methods
### Agent status
### ...

## Common issues
### ....

Contributor

@estherk15 estherk15 left a comment

Thanks @kohyoungheon @JacksonDavenport. Since we're keeping the structure as is, I think it makes sense to leave all the information together as an intro to the actual troubleshooting steps. I left a couple more change suggestions, but this is good to go!

Apply suggested changes

Co-authored-by: Esther Kim <[email protected]>
@kohyoungheon kohyoungheon merged commit 262d35a into master Dec 18, 2025
17 checks passed
@kohyoungheon kohyoungheon deleted the young.koh/containers_log_troubleshooting branch December 18, 2025 17:56
Labels

  • Architecture: Everything related to the Doc backend
  • editorial review: Waiting on a more in-depth review
  • Guide: Content impacting a guide
  • Images: Images are added/removed with this PR
  • tap: Issues created from the Tickets Analysis Platform (TAP) dashboard

5 participants