Qualifications:
Experience with AWS services including S3, Athena, EC2, EMR, and Glue
Ability to troubleshoot and resolve ongoing cluster operations issues
Experience with the integration of data from multiple data sources
Experience with various database technologies such as SQL Server, Redshift, PostgreSQL, and RDS
Experience with one or more of the following data integration platforms: Pentaho Kettle, SnapLogic, Talend Open Studio, Jitterbit, Informatica PowerCenter, or similar
Knowledge of best practices and IT operations in an always-up, always-available service
Experience with or knowledge of Agile Software Development methodologies
Excellent problem-solving and troubleshooting skills
Excellent oral and written communication skills with a keen sense of customer service
Experience collecting, managing, and reporting on large data stores
Awareness of data governance and data quality principles
Well-versed in business analytics, including basic metric building and troubleshooting
Understanding of integration architecture: application integration and data flow diagrams, source-to-target mappings, and data dictionary reports
Familiarity with web services: XML, REST, SOAP
Experience with Git or similar version control software
Experience with integrations with and/or use of BI tools such as GoodData (preferred), Tableau, PowerBI, or similar
Broad experience with multiple RDBMSs: MS SQL Server, Oracle, MySQL, PostgreSQL, Redshift
Familiarity with SaaS/cloud data systems (e.g. Salesforce)
Experience with data warehouse design: star schemas, change data capture, denormalization
Experience writing SQL/DDL queries and applying tuning techniques such as indexing, sort keys, and distribution keys
BS or MS degree in Computer Science or a related technical field
3+ years of data pipeline development experience with tools such as SnapLogic, DataStage, or Informatica
3+ years of SQL experience (NoSQL experience is a plus)
Experience designing, building, and maintaining data pipelines
Responsibilities:
Develop and maintain data models for core packaged application and reporting databases
Monitor the execution and performance of daily pipelines; triage and escalate any issues
Collaborate with analytics and business teams to improve data models and data pipelines
Implement processes and systems to monitor data quality
Write unit/integration tests, contribute to engineering wiki, and document work
Perform data analysis required to troubleshoot data-related issues
Work within an AWS/Linux cloud systems environment in support of data integration solutions
Work closely with a team of frontend and backend engineers, product managers, and analysts
Collaborate with team members, share knowledge, provide visibility of accomplishments, and follow directions