Sumit Kumar Rai
Kathmandu, Nepal
@gmail.com
+977-98
Professional Summary
Sumit is a Senior Data Engineer with 10 years of IT experience. He specializes in solving US healthcare insurance and workforce solution challenges using his diverse skill set and expertise. He excels in building enterprise data warehouses, automating workflows, writing software and scripts, and overcoming architectural and scalability obstacles. His experience as a Data, DevOps, and Software Engineer enables him to tackle technical hurdles and achieve business objectives with top-notch results.
Skills
PYTHON · DBT · SQL · SNOWFLAKE · DATABRICKS · ETL / ELT · PYSPARK · AWS · DOCKER CONTAINERIZATION · CI/CD
Experience
Techkraft Inc., Lalitpur, Nepal
Senior Data Engineer
Jan 2023 - Present
- Managed a team of six highly skilled Software Engineers, providing leadership and guidance in understanding business objectives, system architectures, workflows, and technical challenges, while collaborating daily with Nepal, India, and US team members to ensure seamless coordination across different time zones and cultural contexts.
- Streamlined and consolidated insurance claims data from multiple US Healthcare Insurance organizations, enhancing data usability and enabling efficient report generation.
- Architected and developed automated data pipelines on the Databricks Platform using PySpark and Spark SQL for seamless data extraction, loading, and transformation.
- Ensured timely delivery of project requirements and conducted thorough reviews to maintain high-quality standards.
- Facilitated cross-department communication and collaboration with Business Analyst (BA) and Quality Assurance (QA) teams, ensuring alignment and effective coordination throughout the project lifecycle.
- Mentored participants from Nepal and India in a US Healthcare Bootcamp for two months, providing guidance and expertise in healthcare data management and analytics.
DATABRICKS · DATABRICKS PIPELINE · SNOWFLAKE · PYSPARK · SPARK SQL · PYTHON · DATA ENGINEERING · DATA WAREHOUSING · DOCKER · CONTAINERIZATION · ANTLR4 · ETL / ELT · DBT · TERRAFORM · AMAZON WEB SERVICES · AWS S3 · OLAP / OLTP · GITLAB · PANDAS · EXCEL · JSON / YAML · PARQUET · REST APIS · JIRA · SCRUM
CloudFactory, Kathmandu, Nepal
Senior Data Engineer
Dec 2020 - Sep 2022
- Collaborated with teams in Nepal, Kenya, and the UK to achieve the data team's goals.
- Successfully migrated data pipelines from Xplenty to Prefect orchestration, improving cost-effectiveness.
- Developed Python and Pandas pipelines for fetching and ingesting data from clients' REST APIs.
- Provided crucial support to the data team, overseeing technical operations of the enterprise data warehouse. Facilitated cross-departmental understanding of business operations, processes, and data origins.
- Implemented documentation and testing protocols, resulting in an 80% improvement in team performance.
- Supported the establishment of a self-service BI platform with accurate data delivery and user-friendly documentation.
- Developed dbt transformations following the Kimball approach for complete, accurate, and timely data processing.
- Created a CI pipeline in GitHub Actions, ensuring code quality checks, model validation, and data testing.
- Maintained data team standards and data accuracy.
- Orchestrated data pipelines using Prefect, migrating from EC2 to AWS Fargate to enhance visibility and optimize resource allocation.
SNOWFLAKE · DBT · FIVETRAN · PYTHON · DATA ENGINEERING · DATA WAREHOUSING · ENTERPRISE DATA WAREHOUSE · KIMBALL · DOCKER · CONTAINERIZATION · ETL / ELT · PREFECT · TERRAFORM · AMAZON WEB SERVICES · AWS S3 · AWS ECS · AWS ECR · AWS EC2 · OLAP / OLTP · GITHUB · GITHUB ACTIONS · XPLENTY · STITCH · PIPELINEWISE · SINGER · PANDAS · EXCEL · REST APIS · JIRA · SCRUM · KANBAN
CloudFactory, Kathmandu, Nepal
Software Engineer, DevOps
Oct 2018 - Nov 2020
- Improved availability and stability of a critical communication microservice application; recognized and rewarded for raising performance from 95% to 99%.
- Implemented an optimization that reduced EC2 Auto Scaling instance startup time by 70%.
- Replaced Ansible scripts with HashiCorp Packer for faster instance provisioning using custom AWS AMIs built with AWS CodeBuild.
- Developed optimized SQL queries for processing big data, collaborating with data scientists on PySpark and AWS Glue jobs.
- Orchestrated AWS Glue jobs and catalogs using AWS Step Functions and stored raw data in PostgreSQL for further processing and categorization.
- Troubleshot and resolved issues across multiple applications in development, test, and production environments.
- Monitored application performance proactively, optimizing operation and making necessary enhancements.
- Led the upgrade of legacy infrastructure to align with evolving business operations and leverage advanced AWS services.
- Migrated 200-300 instances to an AWS VPC environment, enhancing security and internal connectivity.
- Collaborated with software engineering teams to achieve Scrum sprint goals, fostering effective teamwork and alignment.
ANSIBLE · AMAZON WEB SERVICES · AWS ATHENA · AWS CLOUDFORMATION · AWS S3 · AWS IAM · AWS ECS · AWS ECR · AWS EC2 · AWS ELB · AWS RDS · AWS GLUE · AWS STEP FUNCTIONS · GITHUB · GITHUB ACTIONS · REST APIS · SSH · JENKINS · POSTGRESQL · SQL · MONGODB · RABBITMQ · DOCKER · CONTAINERIZATION · HASHICORP TERRAFORM · HASHICORP PACKER · NGINX · APACHE2 · BURP SUITE · JIRA · SCRUM
Leapfrog Technology, Kathmandu, Nepal
Software Engineer, DevOps
Sep 2016 - Sep 2018
- Developed Python and Ansible scripts to efficiently configure EC2 infrastructures, streamlining the deployment process and ensuring consistent configuration management.
- Implemented Jenkins pipelines to automate source code deployment, enabling seamless and efficient deployment workflows.
- Designed and structured AWS resources utilizing Auto Scaling and Elastic Load Balancing through CloudFormation.
- Configured these resources and resolved related issues to optimize performance and scalability.
- Deployed and configured pfSense firewall for the intranet, including traffic shaping for optimal working environments.
- Configured additional security measures such as VPN (OpenVPN), Snort IPS, and Squid web filter to enhance network security.
- Engineered, implemented, and monitored comprehensive security measures to safeguard computer systems, networks, and sensitive information, ensuring the integrity and confidentiality of data.
- Established FreeIPA and Active Directory identity managers to centralize employee identities and manage access permissions across networks and servers, enhancing security and simplifying user management processes.
AMAZON WEB SERVICES · AWS CLOUDFORMATION · AWS S3 · AWS IAM · AWS EC2 · AWS ELB · AWS RDS · AWS ROUTE53 · ANSIBLE · GITHUB · REST APIS · SSH · JENKINS · SQL · DOCKER · CONTAINERIZATION · HASHICORP TERRAFORM · NGINX · APACHE2 · BURP SUITE · SSL MANAGEMENT · PFSENSE · OPENVPN · FREEIPA · SOPHOS · SCRUM
Incessant Rain Animation Studios, Kathmandu, Nepal
Software Developer
Dec 2014 - Sep 2016
- Designed and maintained file pipelines for seamless file transfers across departments, supporting animators in accessing and uploading necessary files.
- Collaborated with the .NET programming team to automate render job handling, resulting in a streamlined rendering process with improved efficiency and reduced manual intervention.
- Developed a customized remote services manager tool for the rendering department, simplifying render job management and increasing productivity while minimizing administrative overhead.
PYTHON · REST APIS · MICROSOFT SQL SERVER · SQL · REGEX
Education
London Metropolitan University, London, UK
Master of Science in IT & Applied Security
Sep 2019 – Jun 2022
London Metropolitan University, London, UK
Bachelor of Science (Hons) in Computer Networking and IT Security
Feb 2012 – Nov 2014
Informatics Academy, Victoria Street, Singapore
International Diploma in Information and Communication Technology
Apr 2011 – Dec 2011