General Responsibilities
Design, develop, and implement an ingestion framework from the Oracle source to Azure Data Lake, covering both the initial load and incremental ETL.
Tools used:
Oracle GoldenGate (knowledge and experience are an asset but not required) for data ingestion and change data capture (currently in the final stages of a proof of concept)
Azure Data Factory (expert knowledge) to maintain the pipeline from Oracle to Azure Data Lake
Azure Databricks/PySpark (good Python knowledge required) to transform raw data into the curated zone of the data lake (see the sketch after this tools list)
Azure Synapse to build stored procedures and read data from the data lake
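As an illustration of the Databricks/PySpark work listed above, here is a minimal sketch of a raw-to-curated transformation, assuming a Parquet raw zone and a Delta curated zone. The storage account, container paths, and column names are hypothetical placeholders, and `spark` is the session a Databricks notebook provides.

```python
# Minimal raw-to-curated sketch for Databricks/PySpark.
# Paths, columns, and the partitioning scheme are hypothetical placeholders;
# `spark` is the SparkSession a Databricks notebook provides.
from pyspark.sql import functions as F

RAW_PATH = "abfss://raw@datalakeaccount.dfs.core.windows.net/oracle/customers/"
CURATED_PATH = "abfss://curated@datalakeaccount.dfs.core.windows.net/customers/"

# Read what the ingestion pipeline landed in the raw zone (Parquet assumed).
raw_df = spark.read.parquet(RAW_PATH)

# Light conformance: trim strings, standardize types, drop duplicate keys.
curated_df = (
    raw_df
    .withColumn("customer_name", F.trim(F.col("customer_name")))
    .withColumn("load_date", F.to_date(F.col("load_date"), "yyyy-MM-dd"))
    .dropDuplicates(["customer_id"])
)

# Write Delta to the curated zone, partitioned for downstream Synapse reads.
(curated_df.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("load_date")
    .save(CURATED_PATH))
```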
Review the requirements, database tables, and database relationships. Identify gaps and inefficiencies in the current production reporting environment, and provide recommendations to address them in the new platform.
Design the ingestion framework and change data capture (CDC); the preferred tool is Oracle GoldenGate.
Prepare design artifacts
Work with the IT partner on GoldenGate configuration; responsible for providing direction and how-to guidance.
Analyze the data: map the physical model from the data source to the reporting destination.
Understand the requirements and recommend changes to the physical model.
Develop the physical model scripts and create the database.
Access Oracle database environments; use SSIS, SQL Server, and other development tools to develop the solution.
Proactively communicate with the business on any changes required to the conceptual, logical, and physical models; communicate and review dependencies and risks.
Develop the ETL strategy and solution based on the different sets of modules.
Understand the tables and relationships.
Create low-level design documents and unit test cases.
Create the package design workflows.
Develop and test data with incremental and full loads (see the watermark sketch at the end of this section).
Develop high-quality ETL mappings, scripts, and jobs.
ETL data from Applications to Data Warehouse
ETL data from Data Warehouse to Data Mart
Perform unit tests.
Review performance and run data consistency checks.
Troubleshoot performance issues and ETL load issues; log activity for each individual package and transformation.
Review overall ETL performance.
Perform end-to-end integrated testing for full and incremental loads.
Plan for go-live and production deployment.
Create production deployment steps.
Configure parameters and scripts for go-live; test and review the instructions.
Create release documents and help build and deploy code across servers.
Provide go-live support and conduct a post-go-live review.
Review the existing ETL processes and tools, and provide recommendations to improve performance and reduce ETL timelines.
Review infrastructure and any pain points for overall process improvement.
Transfer knowledge to staff and develop documentation on the work completed.
Share documentation covering end-to-end ETL working knowledge, troubleshooting steps, configuration, and script review.
Hand over documents and scripts, and review them with the team.
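Incremental loads against the Oracle source typically follow a high-watermark pattern: keep the maximum modification timestamp seen so far, pull only newer rows, and upsert them into the target. Below is a minimal PySpark/Delta sketch of that pattern; the JDBC connection string, table names, watermark column (last_modified_date), and secret scope are illustrative assumptions, not project specifics, and `spark`/`dbutils` are the objects Databricks provides.

```python
# Watermark-based incremental load sketch (PySpark + Delta Lake).
# Connection details, names, and the watermark column are placeholders.
from pyspark.sql import functions as F
from delta.tables import DeltaTable

TARGET_PATH = "abfss://curated@datalakeaccount.dfs.core.windows.net/orders/"

# High watermark from the existing target; None means this is the initial load.
if DeltaTable.isDeltaTable(spark, TARGET_PATH):
    watermark = (spark.read.format("delta").load(TARGET_PATH)
                 .agg(F.max("last_modified_date")).first()[0])
else:
    watermark = None

# Read the Oracle source over JDBC (driver and credentials assumed configured).
source_df = (spark.read.format("jdbc")
             .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB")
             .option("dbtable", "SALES.ORDERS")
             .option("user", "etl_user")
             .option("password", dbutils.secrets.get("etl-scope", "oracle-pwd"))
             .load())
source_df = source_df.toDF(*[c.lower() for c in source_df.columns])

if watermark is None:
    # Initial load: write everything.
    source_df.write.format("delta").mode("overwrite").save(TARGET_PATH)
else:
    # Incremental load: keep only rows changed since the watermark, then upsert.
    changed = source_df.filter(F.col("last_modified_date") > F.lit(watermark))
    (DeltaTable.forPath(spark, TARGET_PATH).alias("t")
        .merge(changed.alias("s"), "t.order_id = s.order_id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())
```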
Experience:
7+ years of experience working with SQL Server, SSIS, and T-SQL development
2+ years of experience working with Azure SQL Database, Azure Data Factory, Databricks, and Python development
Experience building data ingestion and change data capture using Oracle GoldenGate
Experience building databases, data warehouses, and data marts, and working with delta and full loads
Experience with ETL tools such as SQL Server SSIS, ADF, and other cloud tools
Experience working with MS SQL Server on premises and within the Azure environment
Experience with data modeling and modeling tools, e.g. SAP PowerDesigner
Experience with snowflake and star schema models; experience designing data warehouse solutions using slowly changing dimensions (see the Type 2 sketch after this list)
Experience working with SQL Server SSIS and other ETL tools; solid knowledge of and experience with SQL and other RDBMSs (Oracle, PL/SQL)
Understanding of data warehouse architecture with data vault, dimensional, and fact models
Analyze, design, develop, test and document ETL programs from detailed and high-level specifications, and assist in troubleshooting.
Utilize SQL to perform tasks other than data transformation (DDL, complex queries)
Good knowledge of database performance optimization techniques
Ability to assist in the requirements analysis and subsequent developments
Ability to conduct unit tests and assist in test preparations to ensure data integrity
Work closely with Designers, Business Analysts and other Developers
Liaise with Project Managers, Quality Assurance Analysts and Business Intelligence Consultants
Design and implement technical enhancements to the data warehouse as required.
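For the slowly changing dimensions mentioned above, the common Type 2 pattern expires the current row when a tracked attribute changes and inserts a new current version. Below is a minimal PySpark/Delta sketch of that pattern; the dimension name, business key (customer_id), and tracked attribute (address) are hypothetical.

```python
# Type 2 slowly changing dimension sketch (PySpark + Delta Lake).
# Dimension path, keys, and tracked attributes are hypothetical placeholders.
from pyspark.sql import functions as F
from delta.tables import DeltaTable

DIM_PATH = "abfss://warehouse@datalakeaccount.dfs.core.windows.net/dim_customer/"
dim = DeltaTable.forPath(spark, DIM_PATH)
updates = spark.read.format("delta").load(
    "abfss://curated@datalakeaccount.dfs.core.windows.net/customers/")

# Step 1: expire current rows whose tracked attribute changed.
(dim.alias("d")
    .merge(updates.alias("u"),
           "d.customer_id = u.customer_id AND d.is_current = true")
    .whenMatchedUpdate(
        condition="d.address <> u.address",
        set={"is_current": "false", "end_date": "current_date()"})
    .execute())

# Step 2: insert new versions for changed keys plus brand-new keys.
# After step 1, unchanged keys are still current and are filtered out here.
still_current = (spark.read.format("delta").load(DIM_PATH)
                 .filter("is_current = true").select("customer_id"))
new_rows = (updates.join(still_current, "customer_id", "left_anti")
            .withColumn("is_current", F.lit(True))
            .withColumn("start_date", F.current_date())
            .withColumn("end_date", F.lit(None).cast("date")))
new_rows.write.format("delta").mode("append").save(DIM_PATH)
```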
Skills:
7+ years with ETL tools such as Microsoft SSIS and stored procedures (Must Have)
2+ years with Azure Data Lake and Data Warehouse, and building Azure Data Factory pipelines (Must Have)
2+ years of Python (Must Have)
Databricks
Synapse
Oracle GoldenGate
SQL Server
Oracle
Ability to present technical requirements to the business
Assets:
Knowledge and experience building data ingestion, history tracking, and change data capture using Oracle GoldenGate are a major asset but not required.