feat: use vendored version of requests package in layer. #546

purple4reina · 2024-12-19T21:32:06Z

What does this PR do?

Removes the requests package from the zipped lambda layer. Adds /path/to/pip/_vendor to sys.path so requests can instead be imported from pip._vendor.requests.
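A minimal sketch of the sys.path approach described above, not the code in this PR: the helper name `add_pip_vendor_to_path` is hypothetical, and it assumes pip is installed in the runtime so that its `_vendor` directory can be located.

```python
import importlib.util
import os
import sys


def add_pip_vendor_to_path():
    """Append pip's _vendor directory to sys.path, if it can be found.

    Returns the directory that was added (or was already present), or
    None when pip or its _vendor directory is not available.
    """
    spec = importlib.util.find_spec("pip")
    if spec is None or not spec.submodule_search_locations:
        return None
    pip_dir = list(spec.submodule_search_locations)[0]
    vendor_dir = os.path.join(pip_dir, "_vendor")
    if not os.path.isdir(vendor_dir):
        return None
    if vendor_dir not in sys.path:
        # Append rather than prepend, so a requests package shipped by
        # the customer still takes precedence over the vendored copy.
        sys.path.append(vendor_dir)
    return vendor_dir


vendor_dir = add_pip_vendor_to_path()
```

Appending (instead of inserting at position 0) is a design choice made for this sketch only; the PR description does not say where on sys.path the directory is placed.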

Motivation

Removing requests and all its dependencies from the layer reduces the package size (unzipped) from 15,212 KB to 13,904 KB -- a roughly 8.5% reduction!! (Data gathered running PYTHON_VERSION=3.12 ARCH=arm64 ./scripts/build_layers.sh)
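The reduction can be checked directly from the two sizes quoted above; the variable names are only for illustration.

```python
# Unzipped layer sizes quoted in the Motivation section, in KB.
before_kb = 15_212
after_kb = 13_904

saved_kb = before_kb - after_kb
reduction_pct = 100 * saved_kb / before_kb

print(f"saved {saved_kb} KB ({reduction_pct:.1f}% of the original size)")
# prints: saved 1308 KB (8.6% of the original size)
```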

Potential Drawbacks

Some customers may already be relying on the presence of the requests package in our python layer. This change could crash their lambda functions if requests is imported before datadog_lambda. This would only be the case if they are using the @datadog_lambda_wrapper decorator and not the automatic handler redirection using DD_LAMBDA_HANDLER.

For example, assuming a customer adds just our python layer and nothing else, this handler worked prior to this change but will fail to initialize after it.

import requests

from datadog_lambda.wrapper import datadog_lambda_wrapper

@datadog_lambda_wrapper
def handler(event, context):
    resp = requests.get('https://example.com')
    resp.raise_for_status()
    return 'ok'

Therefore, we absolutely must call out the importance of import order in the release notes.
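The import-order dependence can be reproduced without the layer at all. The sketch below is an illustration under assumptions: `fake_vendored_requests` is a made-up module standing in for requests, and the temporary directory stands in for pip's `_vendor` directory. Importing the module fails until the directory has been added to sys.path, which is the situation a handler is in when it imports requests before datadog_lambda.

```python
import importlib
import os
import sys
import tempfile


def demo_import_order():
    """Show that a module is importable only after its directory is on sys.path."""
    with tempfile.TemporaryDirectory() as vendor_dir:
        # Stand-in for the vendored package (hypothetical module name).
        with open(os.path.join(vendor_dir, "fake_vendored_requests.py"), "w") as f:
            f.write("NAME = 'vendored'\n")

        # 1. Import before the path is extended: this is the failing case.
        try:
            importlib.import_module("fake_vendored_requests")
            before = "imported"
        except ModuleNotFoundError:
            before = "ModuleNotFoundError"

        # 2. Extend sys.path (what importing datadog_lambda would do), then import.
        sys.path.append(vendor_dir)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module("fake_vendored_requests")
            after = module.NAME
        finally:
            # Clean up so the demo leaves the interpreter state untouched.
            sys.path.remove(vendor_dir)
            sys.modules.pop("fake_vendored_requests", None)

    return before, after


result = demo_import_order()
```

For affected customers the fix is to move `import requests` below the `datadog_lambda` import, or to bundle requests with their own function code.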

Testing Guidelines

Additional Notes

If for some reason the importing of requests doesn't work (say pip._vendor.requests isn't installed on the container) then the only situation in which the lambda function would raise an exception is when attempting to submit metrics via the datadog api. We do this when either the metric has a post dated timestamp or there is no extension available. This is therefore a pretty narrow use case.
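One way to get the behaviour described above, where a missing requests package only matters at the moment a metric is submitted via the API, is to defer the import until that code path runs. This is a hedged sketch and not the library's implementation: `load_requests` and `submit_metric_via_api` are hypothetical names, and the function returns what it would send instead of making a network call.

```python
import importlib


def load_requests(candidates=("requests", "pip._vendor.requests")):
    """Return the first importable module among candidates, or None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


def submit_metric_via_api(payload, candidates=("requests", "pip._vendor.requests")):
    """Hypothetical API submission path; fails only when it is actually used."""
    http = load_requests(candidates)
    if http is None:
        raise RuntimeError("requests is unavailable; cannot submit metric via the API")
    # A real implementation would POST the payload here. This sketch only
    # reports which client module it would use, so it sends no traffic.
    return {"client": http.__name__, "payload": payload}
```

Because nothing is imported at module load time, a function that never takes the API path is unaffected when neither candidate can be imported.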

Types of Changes

Bug fix
New feature
Breaking change
Misc (docs, refactoring, dependency upgrade, etc.)

Check all that apply

This PR's description is comprehensive
This PR contains breaking changes that are documented in the description
This PR introduces new APIs or parameters that are documented and unlikely to change in the foreseeable future
This PR impacts documentation, and it has been updated (or a ticket has been logged)
This PR's changes are covered by the automated tests
This PR collects user input/sensitive content into Datadog
This PR passes the integration tests (ask a Datadog member to run the tests)

purple4reina · 2024-12-19T21:42:18Z

tests/integration/handle.py

@@ -21,6 +22,8 @@ def handle(event, context):

    # Generate custom metrics
    lambda_metric("hello.dog", 1, tags=["team:serverless", "role:hello"])
+    lambda_metric("hello.cat", 1, tags=["team:serverless", "role:hello"],
+        timestamp=time.time() - 60)


Creating a custom metric with a timestamp forces our instrumentation to use the datadog api package instead of sending the metric to the extension.
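The routing rule this test relies on can be summarised in a few lines. This is a simplified sketch based only on what the PR description states (the API is used when a metric carries a timestamp or when no extension is available); `choose_metric_route` is a hypothetical name, not a function in datadog_lambda.

```python
import time


def choose_metric_route(timestamp=None, extension_available=True):
    """Return "api" or "extension" for a metric, per the rule described above."""
    if timestamp is not None or not extension_available:
        return "api"
    return "extension"


# Mirrors the two metrics in the integration test handler.
dog_route = choose_metric_route()
cat_route = choose_metric_route(timestamp=time.time() - 60)
```

Here `dog_route` is "extension" and `cat_route` is "api", which is why adding the timestamped metric exercises the requests import in the integration tests.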

purple4reina · 2024-12-19T23:14:22Z

tests/integration/handle.py

    lambda_metric("hello.dog", 1, tags=["team:serverless", "role:hello"])
+    lambda_metric(
+        "hello.cat", 1, tags=["team:serverless", "role:hello"], timestamp=timestamp
+    )


Creating a custom metric with a timestamp forces our instrumentation to use the datadog api package instead of sending the metric to the extension.

datadog-datadog-prod-us1 · 2024-12-19T23:19:44Z

.github/workflows/check_dependencies.yml

@@ -0,0 +1,16 @@
+name: Check Dependencies


🟠 Code Vulnerability

No explicit permissions set at the workflow level

Datadog’s GitHub organization defines default permissions for the GITHUB_TOKEN to be restricted (contents:read, metadata:read, and packages:read).

Your repository may require a different setup, so consider defining permissions for each job following the least privilege principle to restrict the impact of a possible compromise.

You can find the list of all possible permissions in Workflow syntax for GitHub Actions - GitHub Docs. They can be defined at the job or the workflow level.

datadog-datadog-prod-us1 · 2024-12-19T23:19:44Z

.github/workflows/check_dependencies.yml

+  check:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3


🟠 Code Vulnerability

Workflow depends on a GitHub action pinned by tag

When using a third-party action, one needs to provide its GitHub path (owner/project) and can optionally pin it to a Git ref (a branch name, a Git tag, or a commit hash).

No pinned Git ref means the action uses the latest commit of the default branch each time it runs, possibly running newer versions of the code that were not audited by Datadog. Specifying a Git tag is better, but since tags are not immutable, using a full-length hash is recommended to make sure the action content is actually frozen to some reviewed state.

Be careful, however: even pinning an action by hash can still be circumvented by attackers. For instance, if an action relies on a Docker image which is itself not pinned to a digest, it becomes possible to alter its behaviour through the Docker image without actually changing its hash. You can learn more about this kind of attack in Unpinnable Actions: How Malicious Code Can Sneak into Your GitHub Actions Workflows. Pinning actions by hash is still a good first line of defense against supply chain attacks.

Additionally, pinning by hash or tag means the action won’t benefit from newer version updates, if any, including possible security patches. Make sure to regularly check whether newer versions of an action you use are available. For actions coming from a very trustworthy source, it can make sense to use a laxer pinning policy to benefit from updates as soon as possible.

datadog-datadog-prod-us1 · 2024-12-19T23:46:53Z

.github/workflows/check_dependencies.yml

+      - uses: actions/checkout@v3
+
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v4


🟠 Code Vulnerability

Workflow depends on a GitHub action pinned by tag

purple4reina · 2024-12-20T21:32:48Z

We have tried this idea before, though using the requests version vendored with botocore. However, there were major problems and we reverted. See

Since we do not have any control over the lambda runtime, we have little recourse if for some reason the vendored requests version changes or goes away. That could cause every lambda everywhere to start failing, which is an incredibly high risk, even if the chances of it happening are very low.

purple4reina force-pushed the rey.abolofia/vendored-requests branch from 3b9cbac to 540c6ad Compare December 19, 2024 21:40

purple4reina commented Dec 19, 2024

purple4reina force-pushed the rey.abolofia/vendored-requests branch 2 times, most recently from 00f707b to aec1eb0 Compare December 19, 2024 22:05

Use vendored version of requests package in layer.

de9caa5

purple4reina force-pushed the rey.abolofia/vendored-requests branch from aec1eb0 to de9caa5 Compare December 19, 2024 22:13

purple4reina changed the title ~~Use vendored version of requests package in layer.~~ feat: Use vendored version of requests package in layer. Dec 19, 2024

purple4reina changed the title ~~feat: Use vendored version of requests package in layer.~~ feat: use vendored version of requests package in layer. Dec 19, 2024

purple4reina added 2 commits December 19, 2024 14:42

Update integration test snapshots.

69d6f8e

Ignore linting issue.

5ee2c8c

purple4reina commented Dec 19, 2024

datadog-datadog-prod-us1 bot reviewed Dec 19, 2024

purple4reina force-pushed the rey.abolofia/vendored-requests branch 9 times, most recently from 34d00fc to bd762c3 Compare December 19, 2024 23:46

datadog-datadog-prod-us1 bot reviewed Dec 19, 2024

purple4reina force-pushed the rey.abolofia/vendored-requests branch from bd762c3 to 4fc91d2 Compare December 19, 2024 23:49

Github action to ensure vendored requests version.

591509d

purple4reina force-pushed the rey.abolofia/vendored-requests branch from 4fc91d2 to 591509d Compare December 19, 2024 23:51

purple4reina marked this pull request as ready for review December 19, 2024 23:54

purple4reina requested a review from a team as a code owner December 19, 2024 23:54

purple4reina closed this Dec 20, 2024
