What are the responsibilities and job description for the Data Engineer II position at Jobright.ai?
Job Summary:
Brain Corp is a San Diego-based AI company focused on creating transformative technology for the robotics industry. The Software Engineer II, Data Engineering will design and maintain data infrastructure to support the Data Science team, focusing on building scalable data pipelines and ensuring data integrity for machine learning models.
Responsibilities:
• Design and Develop Data Pipelines: Design, develop, and maintain robust and scalable data pipelines for ingesting, transforming, and preparing data specifically for machine learning model training, validation, and inference.
• Feature Engineering & Data Modeling: Implement feature engineering processes and contribute to the design of data models optimized for machine learning workflows, including managing data lineage and versioning for reproducible ML experiments.
• Optimize Performance and Scalability: Enhance the efficiency and scalability of data pipelines and storage systems by identifying bottlenecks and configuring cluster resources.
• Data Science Collaboration & Support: Partner closely with data scientists to understand their data needs, provide expertise in data preparation, and support their infrastructure requirements, including feature store management and model deployment.
• Data Quality & Governance: Implement and enforce data quality standards, security best practices (encryption, access control), and compliance policies for all data, especially sensitive data used in machine learning.
• Incident Response & Troubleshooting: Monitor data pipelines and ML data flows, troubleshoot failures, and resolve reliability issues to ensure data freshness and support ML model performance.
Qualifications:
Required:
• BS or MS in Computer Science or applicable engineering discipline
• 2-5 years of proven software development experience, with at least 2 of those years focused on data engineering
• Proficiency in SQL and the Python programming language
• Experience with data warehousing (BigQuery, Snowflake, MySQL, PostgreSQL)
• Familiarity with stream processing frameworks (Apache Beam, Pub/Sub, Spark Streaming)
• Understanding of data governance, security, and compliance best practices
• Strong problem-solving skills and ability to work independently on projects
• Familiarity with generative AI tools for enhancing development workflows, such as code generation, data exploration, and documentation support
• Effective written and verbal communication skills, with the ability to articulate technical concepts to both technical and non-technical stakeholders
Preferred:
• Experience with machine learning models and data science methodologies
• Experience with Google Cloud and its data ecosystem
• Familiarity with BI tools (e.g., Tableau, Power BI) and data frameworks (e.g., Hadoop, Spark)
• Experience with infrastructure as code (e.g., Terraform, Pulumi) and containerization and orchestration tools (e.g., Docker, Kubernetes) for ML model deployment
Company:
Brain Corp develops core technology for the robotics industry. Founded in 2009, the company is headquartered in San Diego, California, USA, and has 201-500 employees. It is currently a late-stage company and has a track record of offering H-1B sponsorship.