Rapidly developed a horizontally scalable solution using the Kubernetes Python API to
improve the speed of data exports by >95%, ensuring deadlines for data drops were met.
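A minimal sketch of the fan-out pattern behind this bullet, assuming an indexed Kubernetes Job splits the export into shards; all names here are hypothetical, and a real version would submit the manifest via the official `kubernetes` Python client (`BatchV1Api.create_namespaced_job`) rather than just building a dict:

```python
# Hypothetical sketch: fan a data export out over N parallel Job pods.
# Only the manifest is built here, so the scaling knobs are visible.

def build_export_job(name: str, image: str, shards: int) -> dict:
    """Build a Job manifest whose parallelism/completions split the export
    into independently processed shards (indexed completion mode)."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "parallelism": shards,        # pods running at once
            "completions": shards,        # one completion per shard
            "completionMode": "Indexed",  # each pod gets JOB_COMPLETION_INDEX
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{"name": "export-worker", "image": image}],
                }
            },
        },
    }

job = build_export_job("data-export", "example.com/exporter:latest", shards=10)
```

Each pod reads its `JOB_COMPLETION_INDEX` to pick its shard, which is what makes the export horizontally scalable.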
Created an AWS SQS messaging application to run web crawlers on demand or at regular
intervals using a scalable Kubernetes deployment. Integrated with a Slack bot via SNS to
inform users of metrics or warn of broken crawlers.
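The dispatch logic for such a consumer can be sketched as follows; the message schema is a hypothetical example, and in practice `start_crawler` and `notify` would wrap boto3 SQS/SNS clients rather than the stand-in callables shown:

```python
import json

def handle_message(body: str, start_crawler, notify) -> None:
    """Dispatch one SQS-style message body: both scheduled and on-demand
    requests name a crawler to start; failures are pushed to a topic that
    feeds the Slack bot."""
    msg = json.loads(body)
    try:
        start_crawler(msg["crawler"], msg.get("args", {}))
    except Exception as exc:
        notify(f"crawler {msg['crawler']} failed: {exc}")

# Usage with stand-in callables:
started = []
handle_message('{"crawler": "fda"}',
               lambda name, args: started.append(name),
               print)
```

Injecting the two callables keeps the routing logic testable without any AWS dependency.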
Integrated custom-built and off-the-shelf machine learning models into the regulatory
document pipeline using AWS SageMaker to run predictions and translations on text input.
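A hedged sketch of the inference call, assuming a JSON request/response schema (the endpoint name and payload shape are illustrative, not the actual pipeline's contract); `client` would be a boto3 `sagemaker-runtime` client in production, and a stub here:

```python
import io
import json

def predict(client, endpoint: str, texts: list) -> list:
    """Send a batch of texts to a SageMaker endpoint and decode the JSON
    response. `invoke_endpoint` is the real sagemaker-runtime API; the
    {"inputs": [...]} schema is a hypothetical example."""
    resp = client.invoke_endpoint(
        EndpointName=endpoint,
        ContentType="application/json",
        Body=json.dumps({"inputs": texts}),
    )
    return json.loads(resp["Body"].read())

class StubClient:
    """Stand-in for boto3's sagemaker-runtime client, for local testing."""
    def invoke_endpoint(self, EndpointName, ContentType, Body):
        n = len(json.loads(Body)["inputs"])
        return {"Body": io.BytesIO(json.dumps([{"label": "ok"}] * n).encode())}

out = predict(StubClient(), "regdoc-classifier", ["some clause"])
```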
Developed web crawlers using the Scrapy Python framework to scrape PDF and HTML content
from regulatory bodies across the globe.
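The core extraction step can be illustrated with the standard library alone (the real spiders used Scrapy's `parse()` callbacks and link extractors; the URL below is a placeholder):

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect absolute links to PDF documents and follow-up HTML pages
    from a regulator's listing page; a simplified stand-in for a Scrapy
    spider's parse() callback."""
    def __init__(self, base_url: str):
        super().__init__()
        self.base = base_url
        self.pdfs, self.pages = [], []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        url = urljoin(self.base, href)  # resolve relative links
        (self.pdfs if url.lower().endswith(".pdf") else self.pages).append(url)

parser = LinkExtractor("https://regulator.example/docs/")
parser.feed('<a href="rule-1.pdf">Rule 1</a><a href="/docs/page2">Next</a>')
```

In Scrapy, the `pdfs` list would become file-download requests and `pages` would be yielded as follow-up requests.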
Utilised Terraform and AWS CloudFormation to deploy infrastructure as code across sandbox,
staging and production stacks.
Implemented a solution to augment processed documents with human annotations via an EKS
CronJob, S3 and the Label Studio Python SDK.
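The merge step at the heart of that job can be sketched as below; the task shape is a simplified stand-in for what a Label Studio project export contains, and in the real CronJob the documents would be read from and written back to S3:

```python
def merge_annotations(documents: dict, tasks: list) -> dict:
    """Fold completed Label Studio tasks back into processed documents,
    keyed by a doc_id carried in each task's data payload (hypothetical
    field names, simplified from the SDK's export format)."""
    for task in tasks:
        doc_id = task["data"]["doc_id"]
        if doc_id in documents:
            documents[doc_id].setdefault("annotations", []).extend(
                task.get("annotations", []))
    return documents

docs = {"d1": {"text": "..."}}
tasks = [{"data": {"doc_id": "d1"},
          "annotations": [{"result": [{"value": {"labels": ["Obligation"]}}]}]}]
merged = merge_annotations(docs, tasks)
```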
Surfaced cost and usage reports via Athena to provide senior leadership with a view of AWS
costs broken down by service, function, team and more, made possible by resource tagging.
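A sketch of how such a breakdown query might be composed; the table and tag column names are hypothetical, though `line_item_product_code` and `line_item_unblended_cost` are standard Cost and Usage Report columns, and CUR exposes user tags as `resource_tags_user_<key>` columns:

```python
def cost_by_tag_query(table: str, tag_column: str, month: str) -> str:
    """Compose the Athena SQL for breaking AWS cost down by one resource
    tag for a given billing month (illustrative schema)."""
    return f"""
        SELECT {tag_column} AS team,
               line_item_product_code AS service,
               SUM(line_item_unblended_cost) AS cost
        FROM {table}
        WHERE month = '{month}'
        GROUP BY {tag_column}, line_item_product_code
        ORDER BY cost DESC
    """

sql = cost_by_tag_query("cur_reports", "resource_tags_user_team", "2023-06")
```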
Created CLI utilities to standardise CI/CD, testing, packaging, linting and more across projects.
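The shape of such a utility, assuming an `argparse` subcommand layout; the tool and subcommand names are illustrative, not the actual CLI:

```python
import argparse

def build_cli() -> argparse.ArgumentParser:
    """Skeleton of a hypothetical project CLI that puts the shared lint,
    test and packaging workflows behind one entry point."""
    parser = argparse.ArgumentParser(prog="devtool")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("lint", help="run the shared linter configuration")
    sub.add_parser("test", help="run the test suite with coverage")
    pkg = sub.add_parser("package", help="build and publish an artifact")
    pkg.add_argument("--stage",
                     choices=["sandbox", "staging", "production"],
                     default="sandbox")
    return parser

args = build_cli().parse_args(["package", "--stage", "staging"])
```

Centralising these commands in one package is what keeps CI/CD behaviour identical across projects.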
Utilised tools such as SonarQube to help identify bugs, code smells and security risks.
Incorporated Datadog and Sentry into cloud applications to ensure alerting, monitoring and
logging were provided across all stacks and services.
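The logging side of this can be illustrated with the standard library; emitting records as JSON lets a shipper such as the Datadog agent ingest fields intact (a simplified stand-in for the vendor SDK configuration the services actually used):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render log records as single-line JSON objects so downstream
    collectors can index level, logger and message as fields."""
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("service")
log.addHandler(handler)
log.setLevel(logging.INFO)
```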
Practised test-driven development, predominantly through PyTest and Jest.
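A PyTest-style red/green cycle in miniature: the test is written first against a hypothetical helper, then the helper is implemented to make it pass (both names are illustrative):

```python
def normalise_doc_id(raw: str) -> str:
    """Implementation written to satisfy the test below: trim, lowercase
    and hyphenate a raw document identifier."""
    return raw.strip().lower().replace(" ", "-")

def test_normalise_doc_id():
    # Written before the implementation existed.
    assert normalise_doc_id("  FDA Guidance 42 ") == "fda-guidance-42"

test_normalise_doc_id()  # pytest would discover this; run directly here
```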