Darcy.rayner/dd trace support #53

Merged 16 commits on Apr 3, 2020
36 changes: 34 additions & 2 deletions README.md
@@ -10,7 +10,7 @@ Datadog Lambda Layer for Python (2.7, 3.6, 3.7 and 3.8) enables custom metric su

## IMPORTANT NOTE

AWS Lambda is expected to receive a [breaking change](https://aws.amazon.com/blogs/compute/upcoming-changes-to-the-python-sdk-in-aws-lambda/) on **January 30, 2021**. If you are using Datadog Python Lambda layer version 7 or below, please upgrade to version 11.

## Installation

@@ -21,6 +21,7 @@ arn:aws:lambda:<AWS_REGION>:464622532012:layer:Datadog-<PYTHON_RUNTIME>:<VERSION
```

Replace `<AWS_REGION>` with the AWS region where your Lambda function is published. Replace `<PYTHON_RUNTIME>` with one of the following that matches your Lambda's Python runtime:

- `Datadog-Python27`
- `Datadog-Python36`
- `Datadog-Python37`
@@ -81,7 +82,7 @@ If `DD_FLUSH_TO_LOG` is set to false (not recommended), the Datadog API Key must
- DD_KMS_API_KEY - the KMS-encrypted API Key, requires the `kms:Decrypt` permission
- DD_API_KEY_SECRET_ARN - the Secret ARN to fetch API Key from the Secrets Manager, requires the `secretsmanager:GetSecretValue` permission (and `kms:Decrypt` if using a customer managed CMK)

You can also supply or override the API key at runtime (not recommended):

```python
# Override DD API Key after importing datadog_lambda packages
@@ -243,6 +244,7 @@ If your Lambda function is triggered by API Gateway via [the non-proxy integrati
If your Lambda function is deployed by the Serverless Framework, such a mapping template gets created by default.

## Log and Trace Correlations

By default, the Datadog trace ID is automatically injected into logs for correlation when you use the standard Python `logging` library.

If you use a custom logger handler to log in JSON, you can [manually inject](https://docs.datadoghq.com/tracing/connect_logs_and_traces/?tab=python#manual-trace-id-injection) the IDs using the helper function `get_correlation_ids`.
@@ -265,6 +267,36 @@ def lambda_handler(event, context):
})
```

## Datadog Tracer (**Experimental**)

You can now trace Lambda functions using Datadog APM's tracing libraries ([dd-trace-py](https://github.com/DataDog/dd-trace-py)).

1. If you are using the Lambda layer, upgrade it to at least version 15.
1. If you are using the pip package `datadog-lambda-python`, upgrade it to at least version `v2.15.0`.
1. Install (or update to) the latest version of [Datadog forwarder Lambda function](https://docs.datadoghq.com/integrations/amazon_web_services/?tab=allpermissions#set-up-the-datadog-lambda-function). Ensure the trace forwarding layer is attached to the forwarder, e.g., ARN for Python 2.7 `arn:aws:lambda:<AWS_REGION>:464622532012:layer:Datadog-Trace-Forwarder-Python27:4`.
1. Set the environment variable `DD_TRACE_ENABLED` to `true` on your function.
1. Instrument your function using `dd-trace`.

```py
from datadog_lambda.metric import lambda_metric
from datadog_lambda.wrapper import datadog_lambda_wrapper

from ddtrace import tracer


@datadog_lambda_wrapper
def hello(event, context):
    return {
        "statusCode": 200,
        "body": get_message()
    }


@tracer.wrap()
def get_message():
    return "hello world"
```

You can also use `dd-trace` and the X-Ray tracer together and merge the traces into one by setting the environment variable `DD_MERGE_XRAY_TRACES` to `true` on your function.
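Both flags are read as case-insensitive strings, and only the value `true` turns them on. The following is a minimal sketch of that check, modeled on how this change reads `DD_TRACE_ENABLED` and `DD_MERGE_XRAY_TRACES`. The helper `env_flag` and the dictionary `example_env` are hypothetical names used for illustration and are not part of the library.

```python
import os


def env_flag(name, default="false", environ=None):
    """Return True only when the variable equals "true" (case-insensitive)."""
    # `environ` is injectable so the sketch can run without touching os.environ.
    environ = os.environ if environ is None else environ
    return environ.get(name, default).lower() == "true"


# Hypothetical environment standing in for the Lambda function's configuration.
example_env = {"DD_TRACE_ENABLED": "True", "DD_MERGE_XRAY_TRACES": "false"}

dd_tracing_enabled = env_flag("DD_TRACE_ENABLED", environ=example_env)
merge_xray_traces = env_flag("DD_MERGE_XRAY_TRACES", environ=example_env)
```

With this check, values such as `1` or `yes` leave the feature disabled, so the variables should be set to the literal string `true`.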

## Opening Issues

If you encounter a bug with this package, we want to hear about it. Before opening a new issue, search the existing issues to avoid duplicates.
2 changes: 1 addition & 1 deletion datadog_lambda/__init__.py
@@ -1,6 +1,6 @@
# The minor version corresponds to the Lambda layer version.
# E.g., version 0.5.0 gets packaged into layer version 5.
__version__ = "2.14.0"
__version__ = "2.15.0"


import os
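The versioning rule in the comment above (the package's minor version equals the Lambda layer version) can be sketched as a small helper. `layer_version` is a hypothetical name for illustration and does not exist in the package.

```python
def layer_version(package_version):
    """Return the Lambda layer version implied by a package version string."""
    # The minor component (second field) is the layer version.
    return int(package_version.split(".")[1])


current_layer = layer_version("2.15.0")
```

Under this rule, the bump to `2.15.0` in this change corresponds to layer version 15, which matches the minimum layer version named in the README.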
9 changes: 9 additions & 0 deletions datadog_lambda/constants.py
@@ -24,3 +24,12 @@ class XraySubsegment(object):
    NAME = "datadog-metadata"
    KEY = "trace"
    NAMESPACE = "datadog"


# TraceContextSource of datadog context. The DD_MERGE_XRAY_TRACES
# feature uses this to determine when to use X-Ray as the parent
# trace.
class TraceContextSource(object):
    XRAY = "xray"
    EVENT = "event"
    DDTRACE = "ddtrace"
147 changes: 119 additions & 28 deletions datadog_lambda/tracing.py
@@ -4,16 +4,26 @@
# Copyright 2019 Datadog, Inc.

import logging
import os

from aws_xray_sdk.core import xray_recorder
from aws_xray_sdk.core.lambda_launcher import LambdaContext

from ddtrace import patch, tracer
from datadog_lambda.constants import SamplingPriority, TraceHeader, XraySubsegment
from datadog_lambda.constants import (
SamplingPriority,
TraceHeader,
XraySubsegment,
TraceContextSource,
)
from ddtrace import tracer, patch
from ddtrace.propagation.http import HTTPPropagator

logger = logging.getLogger(__name__)

dd_trace_context = {}
dd_tracing_enabled = os.environ.get("DD_TRACE_ENABLED", "false").lower() == "true"

propagator = HTTPPropagator()


def _convert_xray_trace_id(xray_trace_id):
@@ -41,6 +51,43 @@ def _convert_xray_sampling(xray_sampled):
)


def _get_xray_trace_context():
    if not is_lambda_context():
        return None

    xray_trace_entity = xray_recorder.get_trace_entity()  # xray (sub)segment
    return {
        "trace-id": _convert_xray_trace_id(xray_trace_entity.trace_id),
        "parent-id": _convert_xray_entity_id(xray_trace_entity.id),
        "sampling-priority": _convert_xray_sampling(xray_trace_entity.sampled),
        "source": TraceContextSource.XRAY,
    }


def _get_dd_trace_py_context():
    span = tracer.current_span()
    if not span:
        return None

    parent_id = span.context.span_id
    trace_id = span.context.trace_id
    sampling_priority = span.context.sampling_priority
    return {
        "parent-id": str(parent_id),
        "trace-id": str(trace_id),
        "sampling-priority": str(sampling_priority),
        "source": TraceContextSource.DDTRACE,
    }


def _context_obj_to_headers(obj):
    return {
        TraceHeader.TRACE_ID: str(obj.get("trace-id")),
        TraceHeader.PARENT_ID: str(obj.get("parent-id")),
        TraceHeader.SAMPLING_PRIORITY: str(obj.get("sampling-priority")),
    }
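To illustrate what `_context_obj_to_headers` produces, here is a self-contained sketch of the same conversion. The three header names are assumptions about the values of `TraceHeader` (which this diff does not show), and `context_to_headers` and `example_context` are hypothetical names.

```python
# Assumed values of datadog_lambda.constants.TraceHeader (not shown in this diff).
TRACE_ID_HEADER = "x-datadog-trace-id"
PARENT_ID_HEADER = "x-datadog-parent-id"
SAMPLING_PRIORITY_HEADER = "x-datadog-sampling-priority"


def context_to_headers(context):
    """Convert an internal trace-context dict into string-valued headers."""
    return {
        TRACE_ID_HEADER: str(context.get("trace-id")),
        PARENT_ID_HEADER: str(context.get("parent-id")),
        SAMPLING_PRIORITY_HEADER: str(context.get("sampling-priority")),
    }


example_context = {
    "trace-id": 123,
    "parent-id": 456,
    "sampling-priority": 1,
    "source": "xray",  # internal bookkeeping; not emitted as a header
}
headers = context_to_headers(example_context)
```

Two properties are worth noting: the `source` key is dropped, and a missing key is rendered as the string `"None"` rather than omitted, because the value passes through `str()`.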


def extract_dd_trace_context(event):
"""
Extract Datadog trace context from the Lambda `event` object.
@@ -61,23 +108,24 @@ def extract_dd_trace_context(event):
sampling_priority = lowercase_headers.get(TraceHeader.SAMPLING_PRIORITY)
if trace_id and parent_id and sampling_priority:
logger.debug("Extracted Datadog trace context from headers")
dd_trace_context = {
metadata = {
"trace-id": trace_id,
"parent-id": parent_id,
"sampling-priority": sampling_priority,
}
xray_recorder.begin_subsegment(XraySubsegment.NAME)
subsegment = xray_recorder.current_subsegment()
subsegment.put_metadata(
XraySubsegment.KEY, dd_trace_context, XraySubsegment.NAMESPACE
)

subsegment.put_metadata(XraySubsegment.KEY, metadata, XraySubsegment.NAMESPACE)
dd_trace_context = metadata.copy()
dd_trace_context["source"] = TraceContextSource.EVENT
xray_recorder.end_subsegment()
else:
# AWS Lambda runtime caches global variables between invocations,
# reset to avoid using the context from the last invocation.
dd_trace_context = {}

dd_trace_context = _get_xray_trace_context()
logger.debug("extracted dd trace context %s", dd_trace_context)
return dd_trace_context
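The header-based branch of `extract_dd_trace_context` can be sketched in isolation as follows. The start of the real function is elided in this diff, so the way the event's headers are read and lower-cased here is an assumption, as are the literal header names. `extract_context_from_event` is a hypothetical name, and the sketch leaves out the X-Ray subsegment bookkeeping.

```python
def extract_context_from_event(event):
    """Return an event-sourced trace context, or None if headers are incomplete."""
    headers = event.get("headers") if isinstance(event, dict) else None
    if not isinstance(headers, dict):
        return None

    # Header names are matched case-insensitively (assumed header names).
    lowercase_headers = {key.lower(): value for key, value in headers.items()}
    trace_id = lowercase_headers.get("x-datadog-trace-id")
    parent_id = lowercase_headers.get("x-datadog-parent-id")
    sampling_priority = lowercase_headers.get("x-datadog-sampling-priority")

    # All three values must be present; otherwise the caller falls back to X-Ray.
    if trace_id and parent_id and sampling_priority:
        return {
            "trace-id": trace_id,
            "parent-id": parent_id,
            "sampling-priority": sampling_priority,
            "source": "event",
        }
    return None


example_event = {
    "headers": {
        "X-Datadog-Trace-Id": "123",
        "X-Datadog-Parent-Id": "456",
        "X-Datadog-Sampling-Priority": "1",
    }
}
extracted = extract_context_from_event(example_event)
```

In the change above, the `None` case corresponds to the `else` branch, where the module-level context is taken from `_get_xray_trace_context()` instead.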


def get_dd_trace_context():
@@ -86,32 +134,38 @@ def get_dd_trace_context():

If the Lambda function is invoked by a Datadog-traced service, a Datadog
trace context may already exist, and it should be used. Otherwise, use the
current X-Ray trace entity.
current X-Ray trace entity, or the dd-trace-py context if DD_TRACE_ENABLED is true.

Most widely used HTTP clients are patched to inject the context
automatically, but this function can be used to manually inject the trace
context to an outgoing request.
"""
if not is_lambda_context():
logger.debug("get_dd_trace_context is only supported in LambdaContext")
return {}

global dd_trace_context
xray_trace_entity = xray_recorder.get_trace_entity() # xray (sub)segment
if dd_trace_context:
return {
TraceHeader.TRACE_ID: dd_trace_context["trace-id"],
TraceHeader.PARENT_ID: _convert_xray_entity_id(xray_trace_entity.id),
TraceHeader.SAMPLING_PRIORITY: dd_trace_context["sampling-priority"],
}
else:
return {
TraceHeader.TRACE_ID: _convert_xray_trace_id(xray_trace_entity.trace_id),
TraceHeader.PARENT_ID: _convert_xray_entity_id(xray_trace_entity.id),
TraceHeader.SAMPLING_PRIORITY: _convert_xray_sampling(
xray_trace_entity.sampled
),
}

    context = None
    xray_context = None

    try:
        xray_context = _get_xray_trace_context()  # xray (sub)segment
    except Exception as e:
        logger.debug(
            "get_dd_trace_context couldn't read from segment from x-ray, with error %s"
            % e
        )

    if xray_context and not dd_trace_context:
        context = xray_context
    elif xray_context and dd_trace_context:
        context = dd_trace_context.copy()
        context["parent-id"] = xray_context["parent-id"]

    if dd_tracing_enabled:
        dd_trace_py_context = _get_dd_trace_py_context()
        if dd_trace_py_context is not None:
            logger.debug("get_dd_trace_context using dd-trace context")
            context = dd_trace_py_context

    return _context_obj_to_headers(context) if context is not None else {}
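The precedence rules in the new `get_dd_trace_context` body can be restated as a pure function, which makes them easier to reason about and test. This is a sketch that mirrors the branches above; `pick_context` and its parameter names are hypothetical, with `stored_context` standing in for the module-level `dd_trace_context`.

```python
def pick_context(xray_context, stored_context, ddtrace_context, dd_tracing_enabled):
    """Choose the trace context to propagate, mirroring the branches above."""
    context = None
    if xray_context and not stored_context:
        # Only X-Ray knows about this invocation.
        context = xray_context
    elif xray_context and stored_context:
        # Keep the stored trace, but parent new spans to the current X-Ray entity.
        context = dict(stored_context)
        context["parent-id"] = xray_context["parent-id"]

    # An active dd-trace-py span takes priority when tracing is enabled.
    if dd_tracing_enabled and ddtrace_context is not None:
        context = ddtrace_context
    return context


xray = {"trace-id": "1", "parent-id": "10", "sampling-priority": "1", "source": "xray"}
stored = {"trace-id": "2", "parent-id": "20", "sampling-priority": "2", "source": "event"}
ddtrace_ctx = {"trace-id": "3", "parent-id": "30", "sampling-priority": "1", "source": "ddtrace"}

merged = pick_context(xray, stored, None, False)
```

One consequence of these branches is that, without an X-Ray context and without an active dd-trace-py span, the result is `None` even when a stored context exists, so the real function returns an empty dictionary in that case.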


def set_correlation_ids():
@@ -125,6 +179,9 @@ def set_correlation_ids():
if not is_lambda_context():
logger.debug("set_correlation_ids is only supported in LambdaContext")
return
if dd_tracing_enabled:
logger.debug("using ddtrace implementation for spans")
return

context = get_dd_trace_context()

@@ -167,3 +224,37 @@ def is_lambda_context():
regular `Context` (e.g., when testing lambda functions locally).
"""
return type(xray_recorder.context) == LambdaContext


def set_dd_trace_py_root(trace_context, merge_xray_traces):
    if trace_context["source"] == TraceContextSource.EVENT or merge_xray_traces:
        headers = get_dd_trace_context()
        span_context = propagator.extract(headers)
        tracer.context_provider.activate(span_context)
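The condition in `set_dd_trace_py_root` decides whether the function's span is attached to an upstream trace or starts a new one. A minimal sketch of just that decision, using the string values from `TraceContextSource`; `should_activate_upstream_context` is a hypothetical helper name.

```python
def should_activate_upstream_context(source, merge_xray_traces):
    """Return True when the dd-trace-py root should inherit an existing context."""
    # Event-sourced contexts always win; X-Ray contexts only when merging is on.
    return source == "event" or bool(merge_xray_traces)


decisions = {
    ("event", False): should_activate_upstream_context("event", False),
    ("xray", False): should_activate_upstream_context("xray", False),
    ("xray", True): should_activate_upstream_context("xray", True),
}
```

In other words, an X-Ray-sourced context is ignored by default, and `DD_MERGE_XRAY_TRACES` is what opts the function into using X-Ray as the parent trace.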


def create_function_execution_span(
    context, function_name, is_cold_start, trace_context
):
    tags = {}
    if context:
        tags = {
            "cold_start": str(is_cold_start).lower(),
            "function_arn": context.invoked_function_arn,
            "request_id": context.aws_request_id,
            "resource_names": context.function_name,
        }
    source = trace_context["source"]
    if source != TraceContextSource.DDTRACE:
        tags["_dd.parent_source"] = source

    args = {
        "service": "aws.lambda",
        "resource": function_name,
        "span_type": "serverless",
    }
    tracer.set_tags({"_dd.origin": "lambda"})
    span = tracer.trace("aws.lambda", **args)
    if span:
        span.set_tags(tags)
    return span
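The tag-building part of `create_function_execution_span` can be exercised without a tracer. The sketch below reproduces only that part, written for Python 3. `build_span_tags` and `fake_context` are hypothetical, and the ARN and request ID are made-up placeholder values.

```python
from types import SimpleNamespace


def build_span_tags(context, is_cold_start, source):
    """Build the tags set on the function execution span."""
    tags = {}
    if context:
        tags = {
            "cold_start": str(is_cold_start).lower(),
            "function_arn": context.invoked_function_arn,
            "request_id": context.aws_request_id,
            "resource_names": context.function_name,
        }
    # Record where the parent context came from, unless it was dd-trace-py itself.
    if source != "ddtrace":
        tags["_dd.parent_source"] = source
    return tags


# Placeholder object exposing the Lambda context attributes used above.
fake_context = SimpleNamespace(
    invoked_function_arn="arn:aws:lambda:us-east-1:123456789012:function:hello",
    aws_request_id="example-request-id",
    function_name="hello",
)
tags = build_span_tags(fake_context, True, "xray")
```

Note that the cold-start flag is stored as the lowercase string `"true"` or `"false"` rather than as a boolean.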
37 changes: 29 additions & 8 deletions datadog_lambda/wrapper.py
@@ -7,7 +7,7 @@
import logging
import traceback

from datadog_lambda.cold_start import set_cold_start
from datadog_lambda.cold_start import set_cold_start, is_cold_start
from datadog_lambda.metric import (
lambda_stats,
submit_invocations_metric,
@@ -16,10 +16,13 @@
from datadog_lambda.patch import patch_all
from datadog_lambda.tracing import (
extract_dd_trace_context,
set_correlation_ids,
inject_correlation_ids,
dd_tracing_enabled,
set_correlation_ids,
set_dd_trace_py_root,
create_function_execution_span,
)

from ddtrace import patch_all as patch_all_dd

logger = logging.getLogger(__name__)

@@ -81,13 +84,21 @@ def __init__(self, func):
self.logs_injection = (
os.environ.get("DD_LOGS_INJECTION", "true").lower() == "true"
)
self.merge_xray_traces = (
os.environ.get("DD_MERGE_XRAY_TRACES", "false").lower() == "true"
)
self.function_name = os.environ.get("AWS_LAMBDA_FUNCTION_NAME", "function")

# Inject trace correlation ids to logs
if self.logs_injection:
inject_correlation_ids()

# Patch HTTP clients to propagate Datadog trace context
patch_all()
if not dd_tracing_enabled:
# dd-trace-py patches HTTP clients itself when it is enabled. Otherwise,
# patch HTTP clients here to propagate the Datadog trace context.
patch_all()
else:
patch_all_dd()
logger.debug("datadog_lambda_wrapper initialized")
except Exception:
traceback.print_exc()
@@ -105,19 +116,29 @@ def __call__(self, event, context, **kwargs):

def _before(self, event, context):
try:

set_cold_start()
submit_invocations_metric(context)
# Extract Datadog trace context from incoming requests
extract_dd_trace_context(event)
dd_context = extract_dd_trace_context(event)

self.span = None
if dd_tracing_enabled:
set_dd_trace_py_root(dd_context, self.merge_xray_traces)
self.span = create_function_execution_span(
context, self.function_name, is_cold_start(), dd_context
)
else:
set_correlation_ids()

# Set log correlation ids using extracted trace context
set_correlation_ids()
logger.debug("datadog_lambda_wrapper _before() done")
except Exception:
traceback.print_exc()

def _after(self, event, context):
try:
if self.span:
self.span.finish()
if not self.flush_to_log:
lambda_stats.flush(float("inf"))
logger.debug("datadog_lambda_wrapper _after() done")
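The wrapper's lifecycle (open a span in `_before`, finish it in `_after`, and never let instrumentation errors break the handler) can be sketched without any Datadog dependencies. The body of `__call__` is not shown in this diff, so the `try`/`finally` shape below is an assumption. `TracedHandler` and `FakeSpan` are hypothetical stand-ins.

```python
import traceback


class FakeSpan(object):
    """Hypothetical stand-in for a tracer span; records whether it was finished."""

    def __init__(self):
        self.finished = False

    def finish(self):
        self.finished = True


class TracedHandler(object):
    def __init__(self, func, tracing_enabled):
        self.func = func
        self.tracing_enabled = tracing_enabled
        self.span = None

    def __call__(self, event, context):
        self._before(event, context)
        try:
            return self.func(event, context)
        finally:
            # Runs even when the handler raises, so the span is always closed.
            self._after(event, context)

    def _before(self, event, context):
        try:
            self.span = FakeSpan() if self.tracing_enabled else None
        except Exception:
            traceback.print_exc()  # instrumentation errors must not fail the call

    def _after(self, event, context):
        try:
            if self.span:
                self.span.finish()
        except Exception:
            traceback.print_exc()


wrapped = TracedHandler(lambda event, context: {"statusCode": 200}, tracing_enabled=True)
result = wrapped({}, None)
```

Resetting `self.span` at the start of every invocation matters because the Lambda runtime reuses the wrapper object between invocations.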
5 changes: 5 additions & 0 deletions scripts/publish_staging.sh
@@ -0,0 +1,5 @@
#!/bin/bash
set -e

./scripts/build_layers.sh
./scripts/publish_layers.sh us-east-1