Every developer knows the anxiety of holding the pager and dreading the notification that means drop everything and open your work laptop.
I’m looking forward to a future where we can hand this responsibility to an AI as a first line of defence. When the pager goes off, everything that would be sent to the on-call engineer could go to an AI agent first. The agent would see the problem and could call tools to:
- inspect error logs
- view latency graphs
- view API metrics
Then it can write up a detailed report with links to what it found in the logs and monitoring tools, giving the on-call engineer a head start on remediating the issue.
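To make the flow concrete, here is a minimal Python sketch of how that first line of defence might be wired up. Everything in it is an assumption for illustration: the `Alert` and `Finding` types, the three tool stubs, and the `example.com` URLs are hypothetical placeholders, not a real monitoring API. A real agent would also let the model decide which tools to call. This sketch simply runs every tool in a fixed order and formats the results.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Alert:
    """The payload that would normally page the on-call engineer."""
    service: str
    summary: str


@dataclass
class Finding:
    """One piece of evidence gathered by a tool, with a link back to it."""
    tool: str
    detail: str
    link: str


# Hypothetical tool stubs. Real versions would query your logging and
# monitoring backends; the URLs here are placeholders.
def inspect_error_logs(alert: Alert) -> Finding:
    return Finding("error_logs", f"recent errors for {alert.service}",
                   f"https://logs.example.com/{alert.service}")


def view_latency_graphs(alert: Alert) -> Finding:
    return Finding("latency_graphs", f"latency for {alert.service}",
                   f"https://graphs.example.com/{alert.service}/latency")


def view_api_metrics(alert: Alert) -> Finding:
    return Finding("api_metrics", f"request metrics for {alert.service}",
                   f"https://metrics.example.com/{alert.service}")


Tool = Callable[[Alert], Finding]
DEFAULT_TOOLS: List[Tool] = [inspect_error_logs, view_latency_graphs, view_api_metrics]


def triage(alert: Alert, tools: List[Tool] = DEFAULT_TOOLS) -> str:
    """Run each tool against the alert and return a plain-text report."""
    findings: List[Finding] = []
    for tool in tools:
        try:
            findings.append(tool(alert))
        except Exception as exc:
            # A failing tool should not stop the report from being written.
            findings.append(Finding(tool.__name__, f"tool failed: {exc}", ""))

    lines = [f"Triage report for {alert.service}: {alert.summary}"]
    for f in findings:
        suffix = f" ({f.link})" if f.link else ""
        lines.append(f"- [{f.tool}] {f.detail}{suffix}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(triage(Alert("checkout", "5xx rate above threshold")))
```

The report is what would land in front of the on-call engineer alongside the page. Catching tool failures matters here: if the logging backend is itself part of the outage, the engineer still gets whatever the other tools found.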
If you are interested in working on this with me, let’s build.