🔥 Building Audit Logging for Multi-Platform AI Bots with Python and AWS CloudWatch 🔥
aka, who is actually using our bots today?
This blog series focuses on making complex DevOps projects simple and approachable through plain language and lots of pictures. You can do it!
These articles are reader-supported; please consider subscribing to support me in writing more of them <3 :)
Hey all!
We've built three AI bots across our organization: VeraSlack, VeraTeams, and VeraResearch. All three run on AWS Bedrock using Claude Sonnet. They answer questions, search knowledge bases, and help employees find information.
When we first deployed these bots, we relied on AWS Bedrock's built-in logging. Every API call generates a log entry with timestamps and request IDs. Our Lambda functions logged to CloudWatch. We figured we had full visibility: if security needed to audit a conversation, they'd just look at the logs.
The first time our security team asked "What did Bob from Finance ask the bot last Tuesday?", we couldn't answer. We had thousands of Bedrock API log entries, but no way to connect any of them to Bob's question.
Agentic bots don't make one API call per user question. They make dozens.
An employee asks: "What's our process for requesting SSL certificates?" That single question triggers the bot to query the knowledge base, call a reranking service, search Confluence, check PagerDuty, maybe query Jira, synthesize the results with Bedrock, and format a response. That's fifteen separate API calls across multiple services.
Each API call logs separately. Each log entry is atomic: just a request ID, timestamp, and parameters. No user name. No original question. No conversation context.
To reconstruct what Bob asked, you'd need to find his username in Slack's event logs, correlate that to a Lambda execution timestamp, connect it to fifteen different Bedrock API calls scattered across log groups, and piece together the conversation.
Pro tip: this is awful, and I don't want to spend all my time reading logs.
The problem gets worse with conversational bots. VeraResearch uses Bedrock Agents and can hold multi-turn conversations. A five-minute troubleshooting session generates hundreds of log entries. Bedrock logs API requests, but it has no concept of "user," "conversation," or "session."
We needed to answer basic questions: What is Bob asking? What information are we providing about sensitive topics? Which bot is used most? Bedrock's logging couldn't answer any of these.
So we built our own audit logging system: one that captures user identity, the original question, conversation context, and the bot's response, and ties it all together with a single session ID.
In this article, I'll walk you through why platform logging fails for agentic bots, how we designed an audit system across Slack and Teams, and the platform-specific challenges we solved.
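To make the session-ID idea concrete, here's a minimal Python sketch. Every name in it (`new_session_id`, `audit_event`, the record fields) is an illustrative assumption, not our actual implementation: the point is simply that one ID, generated when a conversation starts, gets stamped onto every structured record, so a single CloudWatch Logs query can pull back the whole exchange.

```python
import json
import time
import uuid

def new_session_id() -> str:
    """Generate one ID that ties every log line of a conversation together."""
    return str(uuid.uuid4())

def audit_event(session_id: str, user: str, platform: str,
                question: str, response: str) -> str:
    """Build a single structured audit record as a JSON string.

    In a Lambda handler, printing this JSON sends it to CloudWatch Logs,
    where it can later be filtered by session_id, user, or platform.
    """
    record = {
        "session_id": session_id,   # ties the fan-out of API calls together
        "timestamp": time.time(),
        "user": user,               # identity from Slack/Teams, not Bedrock
        "platform": platform,       # e.g. VeraSlack, VeraTeams, VeraResearch
        "question": question,       # the original question, verbatim
        "response": response,       # what the bot actually answered
    }
    return json.dumps(record)

# Usage: one session ID per conversation, reused for every event in it.
sid = new_session_id()
print(audit_event(sid, "bob@finance", "VeraSlack",
                  "What's our process for requesting SSL certificates?",
                  "Open a certificate request in the service portal."))
```

With this shape, answering "What did Bob ask last Tuesday?" becomes a filter on `user` and `timestamp` rather than a manual correlation across fifteen log groups.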