Labcorp Data Engineer in Durham, North Carolina
Labcorp is recruiting a qualified Data Engineer to join our Informatics team in RTP, NC. This is an ideal job if you are an engineer who wants to be part of a highly skilled team, values total ownership of your work, and can't imagine a day without coding. Most importantly, your work will help deliver on Labcorp's mission to improve lives and improve health through diagnostic testing and the development of new and innovative medicines and therapeutics.

We're looking for a creative, focused, technically curious individual who enjoys both architecture and hands-on coding in a fast-paced startup culture. If you are a skilled developer with professional experience building modern data pipelines, we want to speak to you! Healthcare experience is a big plus.

You will be building data mining, research, and patient recruitment tools for researchers and drug development experts, with a focus on oncology and genetics. The most important skill for this role is being a good listener: you will need to understand end users' needs and challenges to help build data pipelines that meet or exceed client expectations.
Key Responsibilities include:
- Designs, implements, tests, and reviews data pipelines involving multiple technologies, including PySpark, SQL, and orchestration tools such as Apache Airflow
- Proficient in all aspects of the enterprise SDLC; follows best practices for peer review, software composition, branching, deployment, and documentation with minimal oversight
- Understands the best patterns, frameworks, and techniques for data engineering and can recognize where the codebase deviates from established coding standards
- Actively communicates daily progress on a cross-functional team; requires minimal intervention from the team lead to coordinate with other team members
License/Certification/Education: Normally requires a B.S. in Computer Science with 6+ years of experience.
Qualifying Requirements, in order of importance:
- 4+ years of experience designing and implementing complex solutions
- PySpark, or Apache Spark and Python
- Big data: hands-on experience with large datasets and performance tuning of distributed workloads
- Solid computer science fundamentals
- PySpark on CDH and AWS
- Proficiency with data structures, algorithm analysis, and concurrency
- UML or other diagramming skills
Education / Training:
- Bachelor's degree in Engineering or Computer Science; equivalent work experience may be substituted for a degree
Nice-to-have skills:
As an EOE/AA employer, the organization will not discriminate in its employment practices due to an applicant's race, color, religion, sex, national origin, sexual orientation, gender identity, disability, or veteran status.