Greetings

Hi there 🙋🏼‍♂️ Sneaking a peek at my bio? -> Nice to meet you 🙇🏼

tl;dr

  I l🫶ve️ to design virtual infrastructures
    🤹🏻 bring them up with text/yaml files
    🐕‍🦺 pet them as cattle
    🕵🏻‍♂️ watch them scale up and down
  so as to minimize the sad faces of people
  ...building businesses around them.

Work Summary

10+ yrs operations journey

  • shipping application runtimes over AWS, GCP, Linode with IaC
  • service instrumentation (over prometheus-grafana, datadog, scout, slack alerts)
  • site reliability & high availability {clustering, replication, scaling}
  • continuous delivery with build pipelines {github, jenkins, gitlab}
  • security certification {SOC2, ISO27001}, hardening vms, adjusting firewalls, fighting ddos ⚔️
  • debugging and rescuing production, executing graceful service rollouts
  • capacity planning & infrastructure provisioning

and sometimes as a developer

  • created developer tools (for easy server access and deployments)
  • made services/apis (go, ruby), cli tools (bash/go) & web-ui (js)
  • built dapps (react-native) on the ethereum blockchain with solidity/truffle

misc

  • mentor and help teams simplify operational chores
  • advocate 12-factor app development principles

And if you prefer YAML/JSON

intro.yaml
apiVersion: v1
kind: AboutMe
spec:
 labels:
   name: Milan Thapa
   site: https://thapakazi.com
 social:
   github: thapakazi
   twitter: thapakazi_
 xp:
   duration  : 87600h # 10+ yrs
   languages : ["py","go", "rb"]
   cloud     : ["aws","gcp"]
   iac       : ["terraform"]
   containers: ["k8s","ecs"]

 expertise:
   - site reliability engineering
 exploring:
   - platform engineering

 hobbies:
   - strumming 🎸
   - futsal ⚽️
   - misc 🥾🏸🚴

intro.json
{
  "apiVersion": "v1/intro",
  "kind": "AboutMe",
  "spec": {
      "labels": {
          "name": "Milan Thapa",
          "site": "https://thapakazi.com"
      },
      "social": {
          "github": "thapakazi",
          "twitter": "thapakazi_",
          "linkedin": "thapakazi"
      },
      "xp": {
          "duration": "87600h",
          "languages": [
              "py",
              "go",
              "rb"
          ],
          "cloud": [
              "aws",
              "gcp"
          ],
          "iac": [
              "terraform"
          ],
          "containers": [
              "k8s",
              "ecs"
          ]
      },
      "expertise": [
          "site reliability engineering"
      ],
      "exploring": [
          "platform engineering"
      ],
      "hobbies": [
          "strumming 🎸",
          "futsal ⚽️",
          "misc 🥾🏸🚴"
      ]
  }
}

Projects

Skills and Expertise

               | proficiency                  | knows
containers     | kubernetes 🩶(CKA/CKAD)      | ecs, docker-compose
iac            | terraform                    | pulumi, packer, cloudFormation
automation     | ansible, capistrano          |
lib/frameworks | react, rails, buffalo        | sinatra
languages      | golang💚, sh, ruby           | c, js, python
aws            | EKS, EC2, S3, IAM, RDS,…     | DynamoDB, ElastiCache
gcp            | k8s, codebuild, dbs, compute | Storage/Buckets
database       | postgresql, mongo            | redis, mysql, percona tools
monitoring     | prometheus, sensu, datadog   | newrelic, +nagios+
ci/cd          | github/gitlab, jenkins       | travis, circleci
blockchain     | truffle                      | ethereum:SmartContracts
typeset        | org-mode, md                 | latex
editor         | emacs(😈-mode)               | vim, nano
system         | archlinux                    | debians, +windows/mac+
speech         | Nepali, English              | Hindi

Work Experience

Summary

CloudRickshaw

Founder && Senior DevOps Engineer
https://cloudrickshaw.com
2020-Present

  • Started a DevOps-as-a-Service offering with a team to help companies and startups streamline their DevOps culture.
  • Worked with a global team to design, build and deploy banking applications, adhering to strict security and compliance requirements.
  • Provisioned infrastructures and environments with code from day one. Monitored security and operational requirements.
  • Mentored fresh minds to explore and learn DevOps principles. Provided training on Kubernetes.

Zenledger

DevSecOps/SRE
https://zenledger.io
2022-2024

  • Stack: RoR, Go, AWS, Terraform, Kubernetes, Grafana, Loki, Prometheus, Honeybadger, ScoutAPM, Datadog, Github

  • Interesting Chores:

    • improved API server response times from 500-600ms to under 200ms
    • capacity planning and revision that saved more than $10k/month
    • developed a custom proxy using Tinyproxy to scale outgoing connections and bypass third-party geo-restrictions and rate limits
    • built a CLI tool to help devs get to the rails console, fetch logs, and scale their pods up/down
    • custom workflow for static WordPress builds of the website
    • autoscaling of the reader database to handle spikes
    • autoscaling of servers based on custom metrics (backlog of background jobs)
  • Tasks included:

    • bootstrapping all of the necessary infrastructure (introduced iac/terraform after rejoining)
      • preview environments on pull_request with argocd
    • chores:
      • helping optimize database operations
        • reporting system health issues caused by db spikes, helping devs fix the offending queries
        • assisting devs in moving read queries to the reader, adding autoscaling on the db reader
      • monitoring system health and security threats
      • streamlining GitHub Actions (parallel tests) and optimizing self-hosted runner configurations
      • guiding developers to build a robust, secure, and scalable platform
      • documentation:
        • runbooks to handle incidents
        • all the missing infra components and deployments
    • working toward compliance: SOC2 & ISO27001 certifications

Plantura

DevOps Contractor
https://www.plantura.garden/

  • Tech: Amazon Web Services (AWS), Terraform, GitHub Actions, Sentry, Grafana, Helm, Python
  • Assisted with the migration of the Odoo/ERP from odoo.sh to a self-hosted setup on AWS.
  • Helped the team improve the overall observability and performance of the system with graphs and logs.
  • Drafted testing/preview environments by duct-taping Docker together via GitHub Actions.

Innovatetech

DevOps Contractor
https://innovatetech.io/
2020-2022

  • Introduced IaC with Terraform and mentored teammates on it.
  • Streamlined the deployments to be simple and continuous.
  • Helped devs improve the health and performance of the system by instrumenting metrics, traces, and logs with Prometheus, Grafana, ELK, and APM.
  • Took on non-functional chores such as hiring and mentoring DevOps members and advocating for documentation, diagrams, and 12-factor app design principles.
  • Oversaw capacity planning for better resource utilization.
  • Defined a granular DevOps Roadmap and delegated ownership among teammates.
  • Performed general operational chores, specifically migrations, upgrades, production rescues, sharing postmortems, and improving reliability and security.

IRcon

  • Tech: AWS, Kubernetes, Terraform, BitBucket, Grafana, Prometheus, Helm, Java, Python
  • Worked with a global team to design, build and deploy banking applications, adhering to strict security and compliance requirements.
  • Provisioned infrastructures and environments with code from day one.
  • Monitored security and operational requirements.

Upwork

  • Kodekloud | https://kodekloud.com/
    • Had an awesome experience working with Mumshad, creating lab questions for Ansible
  • Misc
    • Migrated a Ruby app to AWS with security improvements
    • Added a pipeline for continuous deployment

Whitehat/Payable

2018-2019

  • Introduced IaC with Terraform and mentored teammates on it.
  • Tech: AWS, Golang, Solidity, React Native, Mobile, Ethereum, Blockchain, Ruby on Rails (RoR), APIs
  • Built a payment utility application bridging crypto and legacy banking payment options.
  • Developed the back end (API & custom CLI client), handled the deployments, and managed the team effort for continuous delivery and improvements.
  • The solution was built on top of the Ethereum and Dwolla APIs.

Cloudfactory

DevOps Engineer
https://www.cloudfactory.com/
2013-2018

  • Tech: Amazon Web Services (AWS), Grafana, Prometheus, Ansible, Ruby on Rails (RoR)
  • Deployed a Ruby runtime on AWS.
  • Worked on capacity planning and built solutions for cost reduction.
  • Set up a CI/CD pipeline for faster, more reliable delivery.
  • Monitored server/business metrics (over graphs, slack alerts).
  • Deployed services with increased availability via clustering and replication (HA).
  • Evangelized better engineering principles/practices (12-factor app) across the team.
  • Executed smooth service migrations with minimal downtime, patched VMs, and built immutable infrastructure.
  • Developed bots to automate mundane tasks.
  • Supported engineering teams with debugging and development chores.

Education & Certifications

  • Bachelor’s in Computer Engineering | 2009-2013 @Kathmandu University
  • Certified Kubernetes Administrator | 2020-2023
  • Certified Kubernetes Application Developer | 2020-2023

Achievements

  • Open-source contributions.
  • Organized community meetups.
  • Ncell App Camp 2014, Category Winner — Tourism | Nov 2014

Recent Itches

  • Building RAG with llama3.2
  • Exploring Asahi

Address?

  • origin: I was born and raised in Nepal (NST, utc+5:45)
  • lives: currently I live with my beautiful wife in Texas (CST, utc-6)
  • family: occasionally I visit them in San Francisco (PST, utc-8)