Location: London, Potters Bar, Bristol or Isle of Man (hybrid working options available)
The Data Engineer is a hands-on technical role responsible for designing, developing, and maintaining data pipelines within the IT department. The pipelines will be built in a modern data lake environment, and the engineer will collaborate with cross-functional teams to gather requirements and develop conceptual data models. This role plays a crucial part in driving data-driven decision-making across the organisation, ensuring data availability, quality, and accessibility for a variety of business needs.
This is not a line management role, but you will play a key part in guiding and upskilling more junior data engineers and setting data standards and guidelines.
What you'll do
- Design, model, develop and maintain data pipelines to ingest, store, process, and present data.
- Ensure data quality, accuracy, and consistency.
- Collaborate with data architects to ensure data pipelines align with the overall data architecture strategy.
- Perform data transformation tasks, including data cleansing, enrichment, and aggregation, to prepare data for analytics and reporting.
- Integrate data from structured and unstructured sources, ensuring compatibility and alignment with data models and business requirements.
- Automate data transformation processes to improve efficiency.
- Implement and maintain data quality checks and validation processes to identify and resolve data anomalies and errors.
- Monitor data pipelines for data quality issues and implement data quality improvements.
- Collaborate with business stakeholders to define data quality requirements.
- Collaborate with data architects and data scientists to design and implement data models, schemas, and structures.
- Ensure that data models support business reporting and analytics needs while optimising query performance.
- Maintain data dictionaries and metadata to document data structures and relationships.
- Optimise data storage, retrieval, and query performance by implementing indexing, partitioning, and caching strategies.
- Monitor data processing performance and address bottlenecks as they arise.
- Stay updated with best practices in data processing performance tuning.
- Create and maintain documentation for data pipelines, data transformation processes, and data integration procedures.
- Foster a culture of knowledge sharing within the data engineering team and across the organisation.
- Collaborate effectively with cross-functional teams, data stakeholders, and business units to understand data requirements and deliver data solutions that meet business needs.
- Communicate technical concepts and data solutions to non-technical stakeholders in a clear and understandable manner.
Knowledge/Skills/Experience
Essential
- Extensive experience in data engineering, including designing and developing data pipelines for retrieval, ingestion, presentation, and semantic layers in an Azure environment.
- Strong skills in Azure Data Factory (ADF), Databricks, SQL, Python, and Power BI.
- Experience acquiring data from a variety of sources, including Salesforce, APIs, XML, JSON, Parquet, flat files, and relational databases.
- An excellent team player, able to work under pressure.
- Effective communication and collaboration skills to work with cross-functional teams and gather data requirements.
- Skills in data modelling (both structured and unstructured data) working directly with the business & data scientists.
- Ability to optimise data solutions for performance, scalability, and efficiency.
Desirable
- Experience in a financial services organisation.
- Lakehouse / Delta Lake and Snowflake.
- Experience with Spark clusters, both permanent elastic clusters and transient clusters.
- Familiarity with data governance, data security, and compliance requirements.
- Power Automate.