Will design, develop, and maintain end-to-end ETL pipelines using Python, PySpark, SQL, and Spark SQL to ingest, transform, validate, and load structured and semi-structured data from multiple internal and external data sources.
Build scalable big data processing solutions using Apache Spark and Databricks to support batch and streaming workloads while ensuring performance, fault tolerance, and efficient resource utilization.
Develop and apply Spark user-defined functions (UDFs) in Python to implement complex business logic, custom data transformations, and advanced data processing requirements.
Perform data architecture and data modelling activities, including logical and physical data models, analytical datasets, and curated data layers to support reporting, analytics, and downstream application consumption.
Develop and enforce data quality and validation frameworks, including schema validation, reconciliation checks, anomaly detection, and exception handling to ensure data accuracy, consistency, and audit readiness.
Develop and maintain RESTful APIs and lightweight FastAPI services in Python to expose curated datasets, enable system integrations, and support application-level data access.
Develop cloud-based data solutions using AWS and Databricks, leveraging services such as Amazon S3, EC2, AWS Lambda, AWS Glue, and AWS Step Functions, while following established security, scalability, and operational best practices.
Orchestrate and automate data workflows using tools such as Apache Airflow or Dagster, including scheduling, dependency management, retries, monitoring, and alerting.
Create analytical views and dashboards using Python-based visualization libraries such as Dash, Matplotlib, and Plotly to support exploratory analysis, operational monitoring, and business reporting.
Optimize data processing performance by tuning Spark jobs, UDF execution, SQL queries, partitioning strategies, and file formats to improve execution time and control cloud infrastructure costs.
Position requires up to 100% domestic travel. This position is for full-time, salaried (W-2), permanent employment.
REQUIREMENTS:
Master’s Degree or foreign equivalent in Information Technology, Information Systems, Engineering, or a related field
Please reference Job Number 636234 when sending resumes. Please mail resumes to: HR, Beacon Hill Solutions Group, LLC, 20 Ashburton Place, 5th Floor, Boston, MA 02108.
California residents: Qualified applicants with arrest or conviction records will be considered for employment in accordance with the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act.
If you would like to complete our voluntary self-identification form, please copy and paste the following link into an open window in your browser: https://jobs.beaconhillstaffing.com/jobs/eeoc/
Completion of this form is voluntary and will not affect your opportunity for employment, or the terms or conditions of your employment. This form will be used for reporting purposes only and will be kept separate from all other records.
Beacon Hill offers a robust benefit package including, but not limited to, medical, dental, vision, and federal and state leave programs as required by applicable agency regulations to those who meet eligibility requirements. Details of our benefit offerings will be provided upon hire.