We are looking for a Data Engineer II to help build the next generation of our data ecosystem as part of our Data Platform team. The Data Platform team acts as stewards of our data. We manage the data ecosystem, from technical infrastructure to data architecture and tactical data transformations, to unlock data as a strategic asset that powers high-quality decision-making at scale. Our services support both internal and external decision-making, from product development and client insights to the development and operationalization of our Healthcare AI to improve the lives of patients and doctors across the country.
As a Data Engineer II, you will build and maintain significant components of our data architecture for our Periop solution. You will be comfortable working closely with both core data users in analytics and data science and backend partners to design, build, and deploy comprehensive data pipelines in support of internal and external data solutions, from data requirement and quality definitions, to pipeline design, to feature engineering within the healthcare space (and its HIPAA restrictions). You will be motivated and excited to have an impact on the team and the company and to improve the quality of healthcare operations.
- Build, tune, and improve the end-to-end workflow that takes data users to data-driven decisions and product features (incl. designing data structures, building and scheduling data transformation pipelines, improving access to critical data assets, etc.).
- Improve the overall definition and quality of our data assets, incl. the automation and management of schema lifecycles, improved tooling for data flow visibility, and data quality tooling/monitoring.
- Support the optimization and scaling of existing pipelines, dashboards, and data science feature engineering as we grow.
- Collaborate with data infrastructure and engineering teams to build, extend, and envision cross-platform ETL and report-generation frameworks that raise engineering standards.
- 3+ years of relevant experience
- Demonstrated experience in leading data modeling and transformation pipeline design for production services and data science / analytics partner needs (strong SQL and dbt experience preferred)
- Working, hands-on knowledge of modern data warehouses and ETL tools (Snowflake, Airflow) and experience designing and optimizing data processing for modern data visualization tools (Looker preferred)
- Experience analyzing data to identify deliverables, gaps, and inconsistencies to ultimately refine dataset requirements and standards
- Professional experience working with modern programming languages such as Python (preferred), Java, Go, or C++, and a dedication to high code quality
Nice to Have
- Strong cross-functional communication - ability to break down complex technical components for technical and non-technical partners alike.
- Practical, hands-on experience with data lake / schema discovery systems and relevant technologies (Confluent/Kafka, Spark, Athena/Glue, DynamoDB, etc.)
- Familiarity with the AWS ecosystem (Lambda, RDS, Redshift etc.)
- Experience handling unstructured data, especially working with NLP tooling to extract information from text
- Bachelor’s degree in Computer Science, Engineering, Analytics, or related field, or equivalent training / experience