What are the responsibilities and job description for the Data-AIOps Engineer position at Raas Infotek?
Data-AIOps Engineer (Onsite)
Fremont, CA - Hybrid
Mandatory Skills
Splunk; PowerShell or Python; alert and log monitoring; Confluence and SharePoint
JD
Job Summary:
We are looking for an experienced Data Ops Engineer to proactively monitor and support data pipelines and related infrastructure in an Azure-based ecosystem and to ensure their reliability.
This role requires strong attention to detail, adherence to SOPs, and effective communication with stakeholders to maintain service levels and operational efficiency.
Experience Requirements:
· 5 years in IT Operations, Data Engineering, or related fields.
· Experience in Azure Data Services, ETL/ELT processes, and ITIL-based operations.
· 2 years in AIOps implementation, monitoring, and automation.
Qualifications:
· Bachelor’s or master’s degree in Computer Science, Engineering, or a related field.
Skill Requirements:
· Basic understanding of Azure Data Services (ADF, Synapse, Databricks).
· Experience in monitoring alerts from data pipelines (Azure Data Factory, Synapse, ADLS, MS Fabric, etc.).
· Familiarity with ETL/ELT concepts, data validation, and pipeline orchestration.
· Experience in identifying failures in ETL jobs, scheduled loads, and streaming data services.
· Hands-on experience with IT monitoring tools (e.g. Splunk, Azure Monitor, Dynatrace, or similar tools).
· Skilled in creating and updating runbooks and SOPs.
· Familiarity with data refresh cycles and the differences between batch and streaming processing.
· Familiarity with ITIL processes for incident, problem, and change management.
· Strong attention to detail, ability to follow SOPs, and effective communication for incident updates.
· Solid understanding of containerized services (Docker/Kubernetes) and DevOps pipelines (Azure DevOps, GitHub Actions), always with an eye on data layer integration.
· Proficiency in Jira, Confluence and SharePoint for status updates and documentation.
· Understanding of scripting (PowerShell, Python, or Shell) for basic automation tasks.
· Ability to interpret logs and detect anomalies proactively.
· Analytical thinking for quick problem identification and escalation.
Preferred:
· Exposure to CI/CD for data workflows and real-time streaming (Event Hub, Kafka).
· Understanding of data governance and compliance basics.
· Experience with anomaly detection, time-series forecasting, and log analysis.
Key Responsibilities:
· Monitor and support data pipelines on Azure Data Factory, Databricks, and Synapse.
· Perform incident management and root-cause analysis for L1 issues, and escalate as needed.
· Surface issues clearly and escalate them to the appropriate SME teams so they can be fixed at the root; avoid repetitive short-term fixes.
· Identify whether issues are at pipeline level, data source level, or infrastructure level and route accordingly.
· Document incident resolution patterns for reuse.
· Acknowledge incidents promptly and route them to the correct team.
· Execute daily health checks, maintain logs, and update system status in collaboration tools.
· Work strictly as per SOPs documented by the team.
· Maintain and update SOPs, runbooks, and compliance documentation.
· Update system health status every 2 hours during the shift in Confluence or SharePoint.
· Update incident status every 4 hours for P1/P2 tickets.
· Complete service tasks on time as per SLA to release queues quickly.
· Ensure compliance with enterprise data security, governance, and regulatory requirements.
· Collaborate with data engineers, analysts, DevOps/SRE teams and business teams to ensure reliability and security.
· Implement best practices in ML operations and productionization.
Thanks & Regards
Sameer Ahmad
Raas Infotek Corporation.
262 Chapman Road, Suite 105A,
Newark, DE 19702
Phone: (302) 565-0068 Ext: 143
E-Mail: sameer.ahmad@raasinfotek.com | Website: www.raasinfotek.com
LinkedIn: linkedin.com/in/sameer-ahmad-031a0b185