Data Engineer III
Bucharest, Bucharest, Romania
Onapsis
Level up your SAP cybersecurity with Onapsis, trusted by SAP and recognized by Gartner. Protect and optimize your landscape.
About the job
The world’s most critical and most at-risk business applications have been neglected for far too long. Onapsis eliminates this blind spot by providing cybersecurity solutions dedicated to business-critical applications. Whether running on premises, in the cloud, or in a hybrid environment, Onapsis helps nearly 30% of the Forbes Global 100 understand the threats and risks across their SAP and Oracle landscapes.
We are seeking a Data Engineer III to join our mission-driven team. This role is ideal for experienced data engineers with a proven track record in architecting scalable data pipelines, leveraging cloud technologies, and contributing to high-impact cybersecurity solutions. You will be responsible for building high-performance ETL frameworks, optimizing data platforms, and contributing directly to the enhancement of our customers' threat detection, response, and remediation capabilities.
What you will be doing, your legacy:
You will work directly with the company's Principal Engineers to evaluate, scope, propose, and build features that fulfill business solution requirements and protect our customers. You will play a direct role in laying the technical foundation for a new product offering. Additionally, you will work with Engineering and DevOps to deliver high-quality products and services, and collaborate closely with security and IT professionals to ensure secure best practices are followed.
Responsibilities:
- Architect and Design Scalable Data Solutions: Design, develop, and maintain data lakehouse solutions (Iceberg, Delta Lake, Hudi), applying industry best practices and structuring and optimizing the data according to data access patterns.
- Data Pipeline Development: Implement ETL/ELT pipelines using cloud technologies (Spark, PySpark, Glue, Kinesis Data Streams, Iceberg) to load data into a lakehouse for both efficient ML processing and UI reporting.
- Data Modeling and Processing: Implement data models and data processing frameworks (Spark, Kafka, Snowflake) to ingest, transform, and load large datasets into lakehouse table formats (Apache Iceberg, Delta Lake, or Apache Hudi), ensuring high availability and reliability of data.
- Advanced Data Integration: Develop solutions that integrate multiple data sources into Snowflake or similar data warehouses to enable real-time analytics and reporting across dashboards.
- AI/ML Integration: Collaborate with cross-functional teams to co-develop AI-driven features that identify patterns and anomalies in client data using AI/ML technologies (Python).
- Compliance and Security: Ensure compliance with industry standards and secure best practices (SOX, SOC 1/2) by implementing data governance frameworks, monitoring data pipelines, and optimizing cloud database architectures to protect sensitive information.
- Stakeholder Collaboration: Work closely with stakeholders, including analysts, engineers, and product managers, to understand their data needs, propose solutions, and drive data-driven decision-making by delivering actionable insights.
- Data Infrastructure Monitoring: Continuously monitor, troubleshoot, and enhance data pipelines, leveraging CI/CD tools (Docker, Jenkins, GitHub Actions) and orchestrating workflows using Apache Airflow to maintain operational efficiency.
- Leadership and Mentorship: Provide hands-on mentorship and technical guidance to junior engineers, including code reviews and architecture discussions.
- Documentation and Governance: Establish comprehensive documentation for data architecture, governance, and processes to ensure scalability, compliance, and security.
Qualifications:
- 3+ years of proven experience as a Data Engineer or in a similar role with a deep understanding of data architecture and cloud-based ETL/ELT frameworks.
- Strong experience with AWS (preferred) or Azure, particularly with Glue, EMR, S3, and Lambda. Databricks, Snowflake, or Synapse experience is a bonus.
- Proficiency in big data technologies such as Apache Spark, Kafka, Hadoop, and Databricks for distributed data processing.
- Proficiency with Python libraries for data processing and ML (e.g., Pandas, NumPy, Polars, Scikit-learn, PyTorch, TensorFlow).
- Hands-on experience in building real-time data processing and AI/ML-driven analytics solutions (SageMaker, Bedrock, NLP, Power BI).
- Ability to architect and manage data lakehouse solutions (Iceberg, Delta Lake, Hudi) or classic warehouse solutions (Redshift, Snowflake).
- Familiarity with compliance and audit requirements (SOX, SOC 1/2, GDPR) and implementing data governance and security frameworks.
- Strong problem-solving skills with a focus on data integrity, scalability, and performance optimization.
- Experience with CI/CD tools (Jenkins, GitHub Actions, Docker) and data orchestration platforms (Apache Airflow).
Preferred Qualifications:
- Experience with advanced data architecture principles (medallion architecture, materialized views, task scheduling).
- Experience using BI tools (e.g., Power BI, Tableau) for real-time analytics and operational reporting.
What we offer:
- A role in shaping the future of protecting the most critical applications that run the world's business and a career that grows as the company grows.
- A unique culture of high achievement and teamwork.
- Supportive and humble colleagues who are among the space's top problem solvers and innovators.
- Financial security through competitive compensation and incentives.
Location: Onapsis is establishing a new development center in Bucharest. This is a hybrid role (1-2 days per week from the office), so candidates must be able to commute to Bucharest each week.
About Onapsis:
Onapsis protects the business applications that run the global economy. The Onapsis Platform delivers vulnerability management, change assurance, and continuous compliance for business applications from leading vendors such as SAP, Oracle, and others. The Onapsis Platform is powered by the Onapsis Research Labs, the team responsible for the discovery and mitigation of more than 1,000 zero-day vulnerabilities in business applications.
Onapsis is headquartered in Boston, MA, with offices in Heidelberg, Germany and Buenos Aires, Argentina, and proudly serves hundreds of the world’s leading brands, including close to 30% of the Forbes Global 100, six of the top 10 automotive companies, five of the top 10 chemical companies, four of the top 10 technology companies, and three of the top 10 oil and gas companies.
For more information, connect with Onapsis on LinkedIn or visit https://www.onapsis.com.