Location: LatAm
Employment type: Full-Time
Industry: E-Learning
Company: Truelogic (101-250 employees; Consulting, Web Development, Web Design, Software)
Requirements:
- 1-3 years of experience working with PySpark and Apache Spark in Big Data environments.
- Experience with SQL, and with both relational and NoSQL databases (PostgreSQL, MySQL, MongoDB, etc.).
- Knowledge of ETL processes and data processing in distributed environments.
- Familiarity with Apache Hadoop, Hive, or Delta Lake.
- Experience with cloud storage (AWS S3, Google Cloud Storage, Azure Blob).
- Proficiency in Git and version control.
- Strong problem-solving skills and a proactive attitude.
- A passion for learning and continuous improvement.
Responsibilities:
- Design, develop, and optimize data pipelines using PySpark and Apache Spark.
- Integrate and process data from multiple sources (databases, APIs, files, streaming).
- Implement efficient data transformations for Big Data in distributed environments.
- Optimize code to improve performance, scalability, and efficiency in data processing.
- Collaborate with Data Science, BI, and DevOps teams to ensure seamless integration.
- Monitor and debug data processes to ensure quality and reliability.
- Apply best practices in data engineering and maintain clear documentation.
- Stay up to date with the latest trends in Big Data and distributed computing.
Skills: PostgreSQL, SQL, Apache Hadoop, Cloud Computing, ETL, Git, MongoDB, MySQL, Apache Kafka
Posted 6 days ago