Data Scientist

Location:

US, Remote

Position Summary

The Data Scientist generates predictive insights and builds decision-support tools from operational and performance data, combining hands-on data science (statistical modeling and machine learning) with ownership of the industrial analytics stack. The Data Scientist serves as a PI System Administrator, Seeq Developer, and builder of AI-enabled tools that help engineers and operators act faster and with confidence.

The Data Scientist will work with industrial data systems, especially the PI System (OSIsoft PI / AVEVA PI), to ensure reliable data availability and governed context, and will develop Seeq analyses and Python-based pipelines for time-series modeling, anomaly detection, forecasting, and performance monitoring. With curiosity, rigor, and comfort with large time-series datasets, the Data Scientist takes solutions end to end, from PI tag and context configuration through Seeq development to Python/AI tool deployment, while collaborating with cross-functional technical teams.

Essential Functions

Data Science, Modeling, and Insights
- Build statistical and machine-learning models to detect anomalies, forecast performance, and identify optimization opportunities
- Design and evaluate experiments and model validation approaches; translate results into clear recommendations for engineering and operations
- Develop dashboards, reports, and model performance metrics to communicate insights and drive data-informed decisions

PI System Administration & Time-Series Data Engineering
- Administer and support the PI System (OSIsoft PI / AVEVA PI), including tag strategy, data quality monitoring, and user support
- Build and maintain PI AF structure (assets, templates, attributes) and documentation to provide governed context for analytics and reporting
- Support PI interfaces/data flows and collaborate with OT/IT and engineers to validate sensors/tags, troubleshoot gaps, and improve reliability and performance
- Create curated datasets, features, and labels from PI data (with clear definitions and lineage) to support Seeq analyses and ML modeling

Seeq Development & AI-Enabled Tools
- Develop and maintain Seeq Workbooks/Analyses for performance monitoring, anomaly detection, and root-cause investigations
- Create reusable Seeq templates, calculation standards, and best practices; enable users through documentation and training
- Build AI-enabled tools (e.g., copilots, guided diagnostics, automated summaries) that leverage governed PI/Seeq context to accelerate engineering workflows
- Evaluate, monitor, and improve AI tool quality (accuracy, drift, user feedback), and implement practical guardrails for safe, reliable use

Python, Analytics Engineering & Deployment
- Develop and maintain Python-based pipelines for data extraction, preprocessing, modeling, and automation
- Prototype and productionize analytical applications that support performance monitoring, anomaly detection, and forecasting
- Automate recurring model runs, evaluations, and reporting workflows with attention to reproducibility and reliability
- Improve existing analytics codebases; contribute to model monitoring, documentation, and maintainable data science practices

Project & Engineering Partnership
- Collaborate with engineers and subject matter experts to translate operational problems into measurable data science objectives
- Provide analytical support for initiatives including data validation, statistical analysis, modeling, and performance reporting
- Help standardize modeling approaches, feature definitions, and evaluation metrics across projects

Data Quality, Governance & Monitoring
- Ensure accuracy and reliability of datasets used for analysis and modeling (validation checks, outlier handling, sensor sanity checks)
- Perform data cleaning, validation, and documentation, including assumptions, feature definitions, and dataset lineage
- Maintain organized analytical workflows and pipelines to support repeatable modeling and ongoing monitoring

Other Responsibilities

Other duties and projects as assigned by management

Education, Experience, and Skills Required

Bachelor’s degree in Data Science, Computer Science, Engineering, Statistics, Applied Math, or related field

2–5 years of experience in data science, applied analytics, or technical modeling roles

Strong Python skills for data science (e.g., pandas, numpy, scikit-learn; visualization libraries)

Strong skills in SQL and Excel for analysis, validation, and stakeholder-ready outputs

Experience with data visualization and reporting tools (Power BI, Tableau, or similar)

Strong statistical reasoning, analytical problem-solving skills, and attention to data quality

Ability to communicate technical findings clearly to both technical and non-technical stakeholders

Demonstrated experience using applied statistics, machine learning, and time-series modeling

Ability to use Python for data science and AI tooling (data wrangling, modeling, visualization; building assistants/automation)

Ability to understand and apply PI System administration fundamentals and data engineering for high-frequency time-series data (tags, quality checks, contextualization)

Proficiency in Seeq development (shared analyses, calculations, templates) and stakeholder-ready data storytelling

Ability to partner cross-functionally and to enable tools and teams (requirements, training, documentation, adoption)

Ability to lead and support deployment-minded practices (reproducibility, versioning, testing, monitoring) for analytics, models, and AI tools

Must have strong verbal and written communication skills

Must be able to read, write and speak English at a level which will permit the employee to accurately understand and communicate information to safely and efficiently perform the job duties

Ability to prioritize and plan work activities so time is used efficiently and effectively

Must demonstrate accuracy and thoroughness to ensure quality performance

Ability to identify and resolve problems in a timely manner

Preferred Qualifications
- Experience working with OSIsoft PI / AVEVA PI System (PI Data Archive, PI AF) and industrial time-series data
- Experience developing in Seeq (Workbench/Organizer), including building shared analyses and calculations
- Experience with operational/engineering datasets (e.g., power generation, rotating equipment, process systems)
- Familiarity with time-series methods (e.g., resampling, lag features, seasonality, change-point detection)
- Experience developing reusable analytics packages, APIs, or scheduled jobs for model execution
- Knowledge of predictive modeling (forecasting, classification/regression), anomaly detection, and model evaluation

Physical Requirements

The ability to work in an office environment, to work at a computer and monitor, and to perform repetitive motions for long periods of time

The ability to periodically lift up to 15 lbs

Nothing in this job description restricts management’s right to assign or reassign duties and responsibilities to this job at any time.