The talent and dedication of our people are critical to our success. We offer an opportunity for developing one’s professional career while working with individuals trained in a variety of disciplines in a collegial and dynamic environment. We also offer a broad range of competitive benefits on a global basis.
- Hiring Department/Group: Public Cloud Infrastructure
- Job Title: Cloud Data Engineer
- Office Location: New York, NY
The successful candidate will be a Data Engineer with experience designing and building data analytics platforms on cloud-based infrastructure. The role will focus on building data pipelines that transform and persist data for various analytics use cases, while ensuring the completeness, consistency, and security of the data. The data sets will include both structured and unstructured data, ranging from time series data and metadata to text.
Experience designing and implementing large-scale data platforms in cloud-based environments for ongoing production analytics is required.
The candidate should have experience with cloud-native data tools on AWS (Kinesis, Glue, Redshift, Athena, Lambda, EMR) and/or Google Cloud (Pub/Sub, Dataflow, Bigtable, BigQuery, Dataproc), as well as with open source platforms such as NiFi, Kafka, Flume, Hadoop, Spark, and Hive.
The candidate must have experience developing production-ready code to perform data transformations and basic analytics in one or more programming languages, including Python. Experience working with numerical, scientific, and machine learning libraries is desired.
Experience with text-based analytics, including basic NLP techniques (tokenization, stemming, NER, etc.), is a strong plus. Experience with Lucene-based search engines is required, with a preference for Elasticsearch.
The candidate should have experience persisting data in multiple forms for different types of analysis. Experience transforming and persisting data to relational, NoSQL, and graph data stores is strongly desired, as is experience working with unstructured data.
Experience with machine learning, leveraging public cloud platforms and APIs or open source frameworks, is a strong plus, preferably including building platforms that support the systematic training and testing of machine learning models.
The candidate should have experience working collaboratively with a team and ensuring that code review, testing, and automation are implemented, with a focus on continuously integrating and deploying updates to the platform without impacting its resiliency or consistency.
- Design large-scale data analytics platforms
- Design data pipelines and persistence in a cloud environment
- Build solutions to process structured and unstructured data from multiple sources
- Work with teams to build data catalogs and track lineage
- Build solutions for OLTP and OLAP
- Manage ETL at scale with open source and cloud-native tools
- Design schemas and normalization strategies based on the queries performed
- Work with Lucene-based search engines, with a preference for Elasticsearch
- Work closely with software developers, data scientists, and quantitative analysts
- Design and implement highly available, scalable, and encrypted cloud-based storage solutions
- Manage, monitor, and operate production platforms
- Hands-on oversight of development work for the data analytics platforms
- Extensive experience managing data engineering for production environments
- Experience designing, building, and automating cloud environments
- Persisting data on cloud-native platforms in ways that optimize for performance, resiliency, and cost
- Building data pipelines, transformations, and catalogs using cloud-native and open source tools
- Experience programming against cloud platform APIs
- Experience with cloud infrastructure templating tools such as CloudFormation
- Building elastically scalable environments that leverage horizontal or vertical scaling
- Developing cost optimization using preemptible, spot, or reserved instances
- Experience developing collaboratively, including infrastructure as code
- Excellent written and verbal communication skills, with an ability to summarize and translate between business and technical contexts
- Excellent troubleshooting and analytical skills
- Self-starter able to execute independently, with light supervision
Degree in a STEM or related field preferred
- Job Category: Full Time