Intermediate Data Engineer - OP01505

Posted 2024-11-07

πŸ’Ž Seniority level: Mid-level, 4+ years

πŸ“ Location: Poland, Bulgaria, Portugal

πŸ” Industry: Art market and blockchain technology

🏒 Company: Dev.Pro

πŸ—£οΈ Languages: English

⏳ Experience: 4+ years

πŸͺ„ Skills: Python, Software Development, Apache Airflow, Blockchain, ElasticSearch, ETL, JavaScript, Machine Learning, MongoDB, MySQL, Tableau, Algorithms, Cassandra, Data Engineering, Grafana, Prometheus, RDBMS, NoSQL

Requirements:
  • 4+ years of experience in data engineering, encompassing data extraction, transformation, and migration.
  • Advanced experience with data extraction from unstructured files and legacy systems.
  • Proven expertise in migrating data from file-based storage systems to Google Cloud Platform.
  • Proficiency with relational databases, specifically MariaDB or MySQL, and cloud-native solutions such as Google Cloud Storage and BigQuery.
  • Strong programming skills in Python, focusing on data manipulation and automation.
  • Extensive experience with ETL/ELT pipeline development and workflow orchestration tools.
  • Hands-on experience with batch processing frameworks and real-time data processing frameworks.
  • In-depth understanding of data modeling, warehousing, and scalable data architectures.
  • Practical experience in developing data mastering tools for data cleaning.
  • Expertise in RDBMS functionality and the ability to handle PII (personally identifiable information) data.
Responsibilities:
  • Take full responsibility for the data warehouse and pipeline, including planning, coding, reviews, and delivery to production.
  • Migrate data from existing file storage systems to Google Cloud Platform.
  • Design, develop, and maintain ETL/ELT pipelines for data migration and integration.
  • Collaborate with the team to re-implement custom data mastering tools for improved data cleaning.
  • Evaluate existing technology stack and provide recommendations for improvements.
  • Develop a new scraper system to extract and aggregate data from external sources.
  • Ensure the integrity, consistency, and quality of data through optimized processes.