Senior Software Engineer
Ireland | Full-Time | Senior
Salary not disclosed
Job Details
- Experience
- 5+ years
- Required Skills
- AWS, Python, SQL, Apache Airflow, ElasticSearch, Kubernetes, Spark
Requirements
- 5+ years of professional experience in Python development, particularly in web scraping and data pipeline systems at scale.
- Strong experience working with REST APIs and processing structured and unstructured data formats (including PDFs, using OCR and PDF-extraction tools such as Tesseract or PyMuPDF).
- Solid understanding of search and data technologies such as ElasticSearch/OpenSearch and relational or NoSQL databases.
- Hands-on experience with workflow orchestration and distributed processing tools such as Apache Airflow and Spark (EMR or equivalent).
- Strong problem-solving skills, especially in handling anti-scraping mechanisms, scaling challenges, and data complexity.
- Experience working in cloud environments such as AWS or GCP.
- Good understanding of system design principles for scalable and resilient backend systems.
- Familiarity with Kubernetes and containerized deployments is a plus.
- Exposure to ML/NLP concepts, LLMs, or frameworks such as spaCy, Hugging Face, TensorFlow, or PyTorch is an advantage.
Responsibilities
- Design and build distributed web crawling and data extraction systems capable of operating at scale in complex environments.
- Develop robust data pipelines to extract, process, and normalize data from web pages, APIs, PDFs, and other document formats.
- Build and maintain systems for unifying heterogeneous data into structured, consistent schemas for downstream use.
- Implement preprocessing and transformation logic to support ML/NLP models, classification systems, and search indexing.
- Develop APIs and services that expose structured data through ElasticSearch/OpenSearch.
- Collaborate with ML and data science teams to integrate classification models into production pipelines.
- Automate workflows using tools such as Apache Airflow and deploy scalable systems using Kubernetes and AWS.
- Optimize and scale data processing pipelines using distributed computing frameworks such as Spark (EMR).