Description
Use case
Original discussion: aws-powertools/powertools-lambda-typescript#3278
Copying @dreamorosi's comment to add more context to this feature request:
Had a discussion about this with @am29d yesterday and I thought it'd be useful to bring here the outcomes of the discussion, primarily for future implementation & consideration.
The log sampling feature changes the log level of a Logger to `debug` for a percentage of requests. Customers can set a rate in the Logger constructor, and this rate is used to determine whether or not to change the log level to `debug`. This is useful for customers who want to keep a less verbose log level most of the time, but have more logs emitted for a percentage of their requests.
As it stands, the feature doesn't behave exactly as described above. This is because the percentage or ratio is not calculated at the request level, but rather when the Logger class is instantiated, which usually happens during the `INIT` phase of the execution environment, i.e.:
```python
from aws_lambda_powertools import Logger

logger = Logger(sampling_rate=0.5)  # whether or not the log level is switched to `debug` is decided here only


def handler(event, context):
    # ... your logic here
    pass
```
This means that all the requests served by the same environment will inherit the sampling decision that was made when the environment was initialized, which in turn results in an effective sampling rate different from the desired one. The degree of this difference will depend on how many environments are spun up and how requests are distributed among them.
To explain what I mean by that, let's consider an example with 3 environments/sandboxes, a number of requests distributed unevenly across them, and - for the sake of simplicity - a log sampling rate of 0.5 (aka 50%). Assuming a truly random chance of 50% per environment, one could end up in a situation where the environments that happened to switch to `debug` serve the bulk of the traffic, resulting in an effective sample rate of ~85% rather than the expected 50%.
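For illustration only, here is a minimal sketch with made-up request counts (the environment names and numbers are hypothetical, not taken from any real workload) showing how a per-environment decision can skew the effective rate when the environments that switched to `debug` happen to serve most of the traffic:

```python
# Hypothetical scenario: 3 execution environments, each making the sampling decision once at INIT.
# Request counts are invented purely to illustrate the skew.
environments = [
    {"name": "env-A", "sampled_debug": True, "requests": 600},
    {"name": "env-B", "sampled_debug": True, "requests": 250},
    {"name": "env-C", "sampled_debug": False, "requests": 150},
]

total_requests = sum(env["requests"] for env in environments)
debug_requests = sum(env["requests"] for env in environments if env["sampled_debug"])

# 850 / 1000 = 85% of requests emit debug logs, despite sampling_rate=0.5
print(f"Effective sampling rate: {debug_requests / total_requests:.0%}")
```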
To get around this, and instead get a rate much closer to the desired 50%, customers can call the `logger.refresh_sample_rate_calculation()` method at the start/end of each request.
```python
from aws_lambda_powertools import Logger

logger = Logger(sampling_rate=0.5)  # the initial sampling decision is still made here


def handler(event, context):
    # ... your logic here
    logger.refresh_sample_rate_calculation()  # re-evaluate the sampling decision at the request level
```
When called, this method essentially flips the coin again and decides whether or not the log level should be switched to `debug`. Because this is done at the request level, statistically speaking, the ratio of sampled requests should be much closer to the desired one.
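As a rough sketch of the statistics only (this is plain Python, not Powertools code), comparing the two strategies shows why a per-request decision converges on the configured rate, while a per-environment decision depends entirely on which environments happened to switch to `debug`:

```python
import random

SAMPLING_RATE = 0.5
REQUESTS_PER_ENV = [600, 250, 150]  # made-up, uneven traffic across 3 environments
TOTAL_REQUESTS = sum(REQUESTS_PER_ENV)

random.seed(42)  # fixed seed so the sketch is reproducible

# Strategy 1: decision made once per environment (what happens at INIT today).
env_decisions = [random.random() < SAMPLING_RATE for _ in REQUESTS_PER_ENV]
debug_per_env = sum(count for count, sampled in zip(REQUESTS_PER_ENV, env_decisions) if sampled)
print(f"Per-environment decision: {debug_per_env / TOTAL_REQUESTS:.0%} of requests at debug")

# Strategy 2: decision refreshed on every request (refresh_sample_rate_calculation).
debug_per_request = sum(1 for _ in range(TOTAL_REQUESTS) if random.random() < SAMPLING_RATE)
print(f"Per-request decision: {debug_per_request / TOTAL_REQUESTS:.0%} of requests at debug")
```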
With this in mind, we should consider easing this situation for customers by adding an optional flag to our class method decorator and Middy.js middleware, so that when this flag is enabled we'll call the `logger.refresh_sample_rate_calculation()` method for them at the end of each request, as proposed here.
The flag would be `false` by default to maintain backward compatibility, although in a future major version we could consider enabling it by default, since this would be a much more accurate behavior than the current one.
Obviously, as mentioned above, this would work only if we're able to wrap the handler, so customers who are not using either of the two mechanisms just mentioned would have to continue calling the `logger.refresh_sample_rate_calculation()` method manually.
Solution/User Experience
We want to provide the following experience:
1 - Customers will continue to set `sampling_rate` at the constructor level.
2 - Customers using the `@logger.inject_lambda_context` decorator will see the sampling rate recalculated on every request, producing the expected result (see the sketch after this list).
3 - Customers not using the decorator must call `refresh_sample_rate_calculation` manually.
4 - Customers not using the decorator or the `refresh_sample_rate_calculation` method will end up with unexpected sampling rates/logs.
5 - We need to update our documentation to make this behavior clearer.
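As a minimal sketch of the experience described in point 2 (illustrative only; the exact mechanics are up to the implementation), the recalculation would happen inside `inject_lambda_context`, so the handler code itself stays unchanged:

```python
from aws_lambda_powertools import Logger

logger = Logger(sampling_rate=0.5)


@logger.inject_lambda_context  # proposed: the decorator refreshes the sampling decision on every invocation
def handler(event, context):
    logger.debug("emitted only for the sampled share of requests")
    logger.info("business as usual")
```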
Alternative solutions
Acknowledgment
- This feature request meets Powertools for AWS Lambda (Python) Tenets
- Should this be considered in other Powertools for AWS Lambda languages? i.e. Java, TypeScript, and .NET