AWS Lambda: Convert logs to metrics using CloudWatch embedded metrics

Published on: Fri Sep 02 2022

Goals

What we’re going to be doing:

  • ✅ Update our logger to convert logs to metrics for error tracking

  • ✅ Work with the client library aws-embedded-metrics for Node.js

  • ✅ View and track the errors in the AWS console (CloudWatch)

Content

Introduction

This guide focuses on demonstrating how to convert AWS CloudWatch logs to metrics using AWS CloudWatch embedded metrics with Node.js.

To make it easier, we will be using the client library provided by AWS to make this happen.

Please use the build-a-webhook-microservice-error-handling repository as a starting point.

What we will be doing is extending the functionality of our error handling by adding embedded metrics to track the errors.

That way we can see the number of errors we are experiencing in our AWS Lambda function.

This can be very useful if we ever want to do any form of alerting or establish error budgets for our system.

Let’s dive right in!

Installation

First things first, we will need AWS’s embedded metrics client library for Node.js, so let’s install that using pnpm.

The package provides an intuitive API for formatting our logs, which eventually get converted to metrics.

1. Install aws-embedded-metrics package

cd functions/ingestion && \
     pnpm install aws-embedded-metrics

Configuration

Once the client library is installed, let’s start by creating a new utility for our metrics.

1. Create new folders and file

mkdir -p src/utils/metrics \
&& touch src/utils/metrics/send-error-metrics.ts

2. Fill out the send error metrics

For more usage patterns, see the examples in the AWS repository for aws-embedded-metrics-node.

Ideally, we should use metrics.setProperty('operation', ...) for the other properties, since these will be logged by the client library.

Add the utility:

// src/utils/metrics/send-error-metrics.ts

import { createMetricsLogger, Unit } from "aws-embedded-metrics";

import { ErrorLogDetails } from '@app/types';

const sendErrorMetrics = async (
  properties: Record<string, any> 
): Promise<void> => {
  const metrics = createMetricsLogger();
  // Add custom k/v of properties into the embedded metric logs
  for (let key in properties) {
    metrics.setProperty(key, properties[key]);
  }
  metrics.putDimensions({ Service: "Aggregator" });
  metrics.putMetric("Error", 1, Unit.Count);
  metrics.setNamespace("Webhook-Service");
  await metrics.flush();
};

export default sendErrorMetrics;

💡 A few notes:
  • Dimension - This is important for aggregation purposes in CloudWatch metrics

    • Think of them as “categories” for your metrics
  • Property - Properties are included in the CloudWatch logs but are not part of the metrics
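
To make the distinction between dimensions and properties more concrete, here is a rough sketch of the kind of JSON document the embedded metric format expects, built by hand. The helper name buildErrorEmfDocument and the sample property values are made up for illustration, and the library's real output will likely contain extra fields (for example, default dimensions it adds on its own), so treat this as an approximation of the shape rather than the exact log line.

```typescript
// Hypothetical helper: builds an EMF-shaped log document by hand, for illustration only.
type EmfDocument = Record<string, unknown>;

function buildErrorEmfDocument(
  properties: Record<string, unknown>,
  timestamp: number = Date.now(),
): EmfDocument {
  return {
    // Properties are spread first so the metric-related keys below win on a name clash.
    ...properties,
    _aws: {
      Timestamp: timestamp,
      CloudWatchMetrics: [
        {
          Namespace: "Webhook-Service",
          // Each inner array is one dimension set; the names refer to top-level keys.
          Dimensions: [["Service"]],
          // Metric definitions; the values live at the top level under the same name.
          Metrics: [{ Name: "Error", Unit: "Count" }],
        },
      ],
    },
    Service: "Aggregator", // dimension value
    Error: 1, // metric value
  };
}

// Sample values are assumptions, not taken from a real invocation.
const sampleDocument = buildErrorEmfDocument(
  { requestId: "abc-123", operation: "verifySignature" },
  0,
);
console.log(JSON.stringify(sampleDocument, null, 2));
```

Only the keys referenced inside _aws.CloudWatchMetrics (Service and Error here) end up as metric data. requestId and operation simply ride along in the log entry, where they can still be searched.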

3. Adding requestId

Now that we have sendErrorMetrics, we can bring it in line with our existing logger implementation.

CloudWatch embedded metrics are still logs after all; the only difference is that metrics are created from them.

Adding the requestId will make it easier for us to search, query and index our logs.

The changes required:

// src/utils/metrics/send-error-metrics.ts

import { createMetricsLogger, Unit } from "aws-embedded-metrics";
import { ErrorLogDetails } from '@app/types';
import asyncLocalStorage from '@app/utils/async-local-storage';

const sendErrorMetrics = async (
  properties: Record<string, any> 
): Promise<void> => {
  const requestId: string = asyncLocalStorage.getStore().get('awsRequestId');
  const metrics = createMetricsLogger();
  metrics.setProperty('requestId', requestId);
  // Add custom k/v of properties into the embedded metric logs
  for (let key in properties) {
    metrics.setProperty(key, properties[key]);
  }
  metrics.putDimensions({ Service: "Aggregator" });
  metrics.putMetric("Error", 1, Unit.Count);
  metrics.setNamespace("Webhook-Service");
  await metrics.flush();
};

export default sendErrorMetrics;
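
The asyncLocalStorage module imported above comes from the starting repository and is not shown in this guide. As a minimal sketch of the idea, assuming it wraps Node's AsyncLocalStorage with a Map as the store, the pattern could look like the following. The names requestStore, withRequestContext and getRequestId are hypothetical and not the repository's actual API.

```typescript
import { AsyncLocalStorage } from "async_hooks";

// Hypothetical stand-in for the project's '@app/utils/async-local-storage' module.
const requestStore = new AsyncLocalStorage<Map<string, string>>();

// Runs `fn` with a fresh store that holds the Lambda request ID.
function withRequestContext<T>(awsRequestId: string, fn: () => T): T {
  const store = new Map<string, string>([["awsRequestId", awsRequestId]]);
  return requestStore.run(store, fn);
}

// Reads the request ID, returning a placeholder when no store is active.
function getRequestId(): string {
  return requestStore.getStore()?.get("awsRequestId") ?? "unknown";
}

const insideId = withRequestContext("req-42", () => getRequestId());
const outsideId = getRequestId();
console.log(insideId, outsideId);
```

Because getStore() returns undefined outside of an active context, guarding the lookup (as getRequestId does here) avoids a runtime error if the metrics utility is ever called before the request context has been captured.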

4. Create type for the error log details

This will be the TypeScript type used in our code for our error log details.

// types.ts

export interface ErrorLogDetails {
  // The error message 
  message: string;
  // The error response sent to the client
  clientResponse: any;
  // [Optional] The specific operation called 
  operation?: string;
  // [Optional] Key values of context
  context?: Record<string, any>;
}

5. Create a utility for tracking errors in our lambda

This will be the actual utility used by our global error handler to capture the error.

In cases where sendErrorMetrics fails, we will fall back to our default logger.

Create new utility file:

touch src/utils/capture-error.ts

Add the logic for the utility:

// src/utils/capture-error.ts

import {
  APIGatewayProxyResult
} from 'aws-lambda';
import { CommonError } from '@app/errors';
import logger from '@app/services/logger';
import { ServiceError, ErrorLogDetails } from '@app/types';
import { sendErrorMetrics } from '@app/utils';

/**
*
* A utility function that captures errors through logging and metrics
*
*/
async function captureError(
  error: ServiceError | Error,
  response: APIGatewayProxyResult,
): Promise<void> {
  const logDetails: ErrorLogDetails = {
    clientResponse: response,
    message: error.message
  };
  // Only our own error classes carry the extra operation and context metadata
  if (error instanceof CommonError) {
    logDetails.operation = error.operation;
    logDetails.context = error.context;
  }
  try {
    await sendErrorMetrics(logDetails);
  } catch (err: any) {
    // fallback to default logger
    logger.error({
      operation: 'captureError#sendErrorMetrics',
      message: err.message,
      context: {
        logDetails
      }
    });
  }
}

export default captureError;

6. Add tests for captureError utility [Optional]

Create new file:

touch src/utils/__tests__/capture-error.test.ts

Add tests:

// src/utils/__tests__/capture-error.test.ts

import captureError from '@app/utils/capture-error';
import { VerifySignatureError } from '@app/errors';
import sendErrorMetrics from '@app/utils/metrics/send-error-metrics';
import logger from '@app/services/logger';

jest.mock('@app/utils/metrics/send-error-metrics', () => ({
  __esModule: true,
  default: jest.fn(),
}));

jest.mock('@app/services/logger', () => ({
  __esModule: true,
  default: {
    error: jest.fn(),
  }
}));

describe('utils/captureError', () => {
  beforeEach(() => {
    jest.resetAllMocks();
    jest.clearAllMocks();
  });

  describe('sendErrorMetrics', () => {
    it('should call metrics function with the correct log details', async() => {
      const error = new Error('mock error');
      const response = {
        statusCode: 500,
        body: JSON.stringify({
          message: error.message,
        })
      };
      await captureError(error, response);
      expect(sendErrorMetrics).toBeCalledWith({
        message: error.message,
        clientResponse: response,
      });
    });
    it('should include other log metadata when it exists (operation, context)', async() => {
      const error = new VerifySignatureError('mock error')
        .setOperation('mockOperation')
        .setContext({
          a: 'a',
          b: 'b',
        });
      const response = {
        statusCode: 500,
        body: JSON.stringify({
          message: error.message,
        })
      };
      await captureError(error, response);
      expect(sendErrorMetrics).toBeCalledWith({
        context: {
          a: 'a',
          b: 'b',
        },
        operation: 'mockOperation',
        message: error.message,
        clientResponse: response,
      });
    });
  });
  describe('fallback: default logger', () => {
    it('should call the logger with the correct log details', async() => {
      (sendErrorMetrics as jest.Mock).mockImplementation(() => {
        throw new Error('sendErrorMetrics failed');
      });
      const error = new Error('mock error');
      const response = {
        statusCode: 500,
        body: JSON.stringify({
          message: error.message,
        })
      };
      await captureError(error, response);
      expect(logger.error).toBeCalledWith({
        operation: 'captureError#sendErrorMetrics',
        message: 'sendErrorMetrics failed',
        context: {
          logDetails: {
            clientResponse: response,
            message: error.message,
          },
        }
      });
    });
    it('should include other log metadata when it exists (operation, context)', async() => {
      (sendErrorMetrics as jest.Mock).mockImplementation(() => {
        throw new Error('sendErrorMetrics failed');
      });
      const error = new VerifySignatureError('mock error')
        .setOperation('mockOperation')
        .setContext({
          a: 'a',
          b: 'b',
        });
      const response = {
        statusCode: 500,
        body: JSON.stringify({
          message: error.message,
        })
      };
      await captureError(error, response);
      expect(logger.error).toBeCalledWith({
        operation: 'captureError#sendErrorMetrics',
        message: 'sendErrorMetrics failed',
        context: {
          logDetails: {
            context: {
              a: 'a',
              b: 'b',
            },
            operation: 'mockOperation',
            clientResponse: response,
            message: error.message,
          },
        }
      });
    });
  });
});

7. Export our utilities in the utils index file

// src/utils/index.ts

export { default as verifySignature } from './verify-signature';
export { default as getSqsMessage } from './get-sqs-message';
export { default as handleError } from './handle-error';
export { default as captureError } from './capture-error';
export { default as sendErrorMetrics } from './metrics/send-error-metrics';

8. Integrating into the global error handler

Finally, now that we have our utility, let’s integrate it into the global error handler.

// src/utils/handle-error.ts

import {
  APIGatewayProxyResult
} from 'aws-lambda';

import {
  CommonError,
  AwsSqsServiceError,
  VerifySignatureError,
} from '@app/errors';

import logger from '@app/services/logger';
import asyncLocalStorage from '@app/utils/async-local-storage';
import { captureError } from '@app/utils';

import { ServiceError } from '@app/types';

export default async function handleError(
  error: ServiceError | Error
) : Promise<APIGatewayProxyResult> {
  const requestId: string = asyncLocalStorage.getStore().get('awsRequestId');
  const response : any = {
    statusCode: 500,
    body: {
      errorTrackingId: requestId,
      message: 'Something went wrong',
      errors: []
    },
  };
  switch (error.constructor.name) {
    // Authentication failure or signature mis-match
    case VerifySignatureError.name:
      response.statusCode = 401;
      break;
    // SQS error - server error
    case AwsSqsServiceError.name:
      break;
    default:
      break;
  }
  response.body.message = error.message;
  response.body.errors.push(error.message);
  response.body = JSON.stringify(response.body);
  // Logging & Metrics
  await captureError(error, response);
  return response;
}
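
To see how the switch on error.constructor.name plays out, here is a small self-contained sketch. The error classes are simplified stand-ins for the ones in @app/errors (which are not shown in this guide), and statusCodeFor and buildErrorBody are hypothetical helpers that only mirror the logic above.

```typescript
// Simplified stand-ins for the classes in '@app/errors' (assumption: the real ones differ).
class CommonError extends Error {
  constructor(message: string) {
    super(message);
    // Keeps `constructor` pointing at the subclass even on older compile targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
class VerifySignatureError extends CommonError {}
class AwsSqsServiceError extends CommonError {}

// Mirrors the switch in handleError: only signature failures change the default 500.
function statusCodeFor(error: Error): number {
  switch (error.constructor.name) {
    case VerifySignatureError.name:
      return 401;
    case AwsSqsServiceError.name:
    default:
      return 500;
  }
}

// Builds the serialized body the client receives, including the tracking ID.
function buildErrorBody(error: Error, requestId: string): string {
  return JSON.stringify({
    errorTrackingId: requestId,
    message: error.message,
    errors: [error.message],
  });
}

const signatureError = new VerifySignatureError("Signature mismatch");
console.log(statusCodeFor(signatureError), buildErrorBody(signatureError, "req-42"));
console.log(statusCodeFor(new AwsSqsServiceError("Queue unavailable")));
```

Since the comparison is on the exact class name, an error that is not one of the listed classes falls through to the default branch and keeps the 500 status code.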

9. Tweak the lambda function

Since we converted handleError to an async function, we will need to update our code in index.ts.

This should be a very minor change!

import {
  APIGatewayProxyEvent,
  APIGatewayProxyResult
} from 'aws-lambda';

import { sendMessage } from '@app/services/sqs-service';
import {
  handleError,
  verifySignature,
} from '@app/utils';

import { captureRequestContext } from '@app/utils/async-local-storage';

export const handler = async(
  event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> => {
  captureRequestContext(event);
  let messageId = '';
  try {
    // 1. Verify Signature
    verifySignature(event);
    // 2. Add to Queue
    messageId = await sendMessage(event);
  } catch (err: any) {
    // 3. Error handling (final touch)
    const errorResponse: APIGatewayProxyResult = await handleError(err);
    return errorResponse;
  }

  // 4. Response
  return {
    statusCode: 200,
    body: JSON.stringify({
      messageId,
      message: 'success'
    }),
  }
}

10. Apply the infrastructure

Now that the code is ready to go, let’s apply the infrastructure and test it out!

Run the following:

# This will re-generate the assets
pnpm run generate-assets --filter "@function/*"

export AWS_ACCESS_KEY_ID=<your-key>
export AWS_SECRET_ACCESS_KEY=<your-secret>
export AWS_DEFAULT_REGION=us-east-1

terraform init
terraform plan
terraform apply -auto-approve

11. Test out the infrastructure

Send a request without a valid signature using the curl command below.

This should cause the API to return a 401 because the signature verification will fail.

Afterwards, we can go to the AWS console and check the custom CloudWatch embedded error metrics.

curl -X POST "<api_endpoint>/webhooks/receive" \
-H "Content-Type: application/json" \
--data-raw '{"data": "test"}'

12. Verify Metrics on AWS console

If it all went as expected, you should see the following:

Illustration of custom Cloudwatch embedded metrics on AWS console

13. Clean up

When you are done, destroy the infrastructure so you won’t be charged for having it running!

terraform destroy -auto-approve

Conclusion

By using CloudWatch embedded metrics in our Node.js infrastructure, we are now able to emit CloudWatch logs that get converted to CloudWatch metrics.

To make it easier, AWS also provides us with client libraries (aws-embedded-metrics) to create these logs.

This is a very powerful and useful capability offered out of the box when using AWS Lambda.

⭐️ However, when using it, just keep in mind you will incur extra cost for the CloudWatch metrics on top of the logs. ⭐️

I hope you learned something new and found this helpful!

If you did, please share this article with a friend or co-worker 🙏❤️! (Thanks!)


Jerry Chang 2022. All rights reserved.