Senior Data & Machine Learning Engineer
Sentra
- R&D
- Tel Aviv Office
- Full-time
Description
Sentra is the global leader in cloud-native data security for the AI era. The company’s mission is to empower organizations to confidently scale their data operations across multi-cloud and on-premises environments while leveraging the power of AI without compromising security.
Sentra’s unique approach enables enterprises to autonomously scan their environments without the need for agents, ensuring that data remains securely within their cloud or on-premises infrastructure. This innovative methodology sets us apart in the industry, providing organizations with control and visibility over their sensitive data at all times. Our commitment to excellence in data security posture management and data detection and response makes Sentra a leader in the field.
About the Role
We are looking for a Senior Data & Machine Learning Engineer to operate at the intersection of data platform engineering and machine learning enablement. This role is responsible for building scalable, efficient, and reliable data systems while enabling Data Science and Analytics teams to develop and deploy ML-driven features.
You will take ownership of the data and ML infrastructure layer, ensuring that pipelines, storage models, and compute usage are optimized, while also shaping how data workflows and ML solutions are designed across the organization.
Responsibilities
Data Platform & Infrastructure
- Design, build, and maintain scalable data pipelines and storage systems supporting analytics and ML use cases
- Ensure compute and cost efficiency across pipelines, storage models, and processing workflows
- Own and improve data orchestration, transformation, and serving layers (e.g., Spark, DBT, streaming/batch systems)
- Build and maintain shared infrastructure components, including:
  - IO managers and data access abstractions
  - Integrations with DBT, Spark, and other data frameworks
  - Internal tooling to improve developer productivity and reliability
ML Enablement & Collaboration
- Partner closely with Data Science to design and productionize ML solutions for new features and research initiatives
- Translate experimental models into robust, scalable production systems
- Support feature engineering, training pipelines, and inference workflows
- Help define best practices for ML lifecycle management (training, validation, deployment, monitoring)
Data Quality, Governance & Best Practices
- Enforce best practices for building and maintaining data processes across the Analytics and Data Science teams
- Define standards for:
  - Data modeling and transformations
  - Pipeline reliability and observability
  - Testing, versioning, and documentation
- Improve data quality, consistency, and discoverability across the organization
Performance & Reliability
- Optimize systems for performance, scalability, and cost efficiency
- Monitor and troubleshoot data pipelines and ML systems in production
- Implement observability (logging, metrics, alerting) across data workflows
Requirements
- Strong programming skills in Python (or similar language)
- Proven experience building and maintaining production-grade data pipelines
- Hands-on experience with data processing frameworks (e.g., Spark or similar)
- Familiarity with DBT or modern data transformation workflows
- Experience working with cloud environments (AWS, GCP, or Azure)
- Solid understanding of data modeling, distributed systems, and ETL/ELT patterns
Preferred Qualifications
- Experience productionizing machine learning models and pipelines
- Familiarity with orchestration tools (e.g., Airflow, Dagster, Prefect)
- Experience building internal platforms or developer tooling
- Knowledge of feature stores, model serving, or real-time inference systems
- Experience optimizing compute costs and performance at scale
- Background working closely with Data Science and Analytics teams
What Success Looks Like
- Data pipelines are reliable, observable, and cost-efficient
- Data Science teams can move faster from research to production
- Best practices are clear and enforced across data workflows
- Shared infrastructure reduces duplication and improves developer velocity
- ML-powered features are robust, scalable, and maintainable in production