Linked Service for Azure PGSQL: Create the linked service for the Azure Database for PostgreSQL – Flexible Server connection. The user must have read and write permission on the schema and tables to be copied from the Oracle database.
Azure Data Factory Dataflows: Create the Azure Data Factory data flows that join the source data with the watermark table and filter out the deltas that need to be loaded into the Azure Database for PostgreSQL – Flexible Server database.
Once the delta records are captured, the next step is to add the Alter Row option to UPSERT the data into the target database based on its primary key column.
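For clarity, the effect of the Alter Row/UPSERT step on the target can be pictured as a keyed upsert in PostgreSQL. Below is a minimal Python sketch of that idea, assuming a hypothetical employees table with primary key emp_id and the psycopg2 driver; in the actual solution the upsert is performed by the data flow sink, not by custom code.

```python
import psycopg2

# Minimal sketch of what the Alter Row / UPSERT sink does against the target.
# Table and column names (employees, emp_id, name, last_updated) are assumptions.
conn = psycopg2.connect(
    host="myserver.postgres.database.azure.com",  # placeholder Flexible Server host
    dbname="appdb", user="loaduser", password="***",
)

delta_rows = [
    (101, "Alice", "2024-01-01 10:15:00"),
    (102, "Bob",   "2024-01-01 10:20:00"),
]

upsert_sql = """
    INSERT INTO employees (emp_id, name, last_updated)
    VALUES (%s, %s, %s)
    ON CONFLICT (emp_id)                  -- the primary key column drives the upsert
    DO UPDATE SET name = EXCLUDED.name,
                  last_updated = EXCLUDED.last_updated;
"""

with conn, conn.cursor() as cur:
    cur.executemany(upsert_sql, delta_rows)  # insert new keys, update existing ones
```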
Important to note: while loading the watermark table, capture the current timestamp as WATERMARKUPDVALUE, since it will be needed to update the watermark table for the next run.
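A minimal Python sketch of this watermark bookkeeping is shown below, assuming a hypothetical WATERMARKTABLE with columns TABLE_NAME, WATERMARKVALUE and WATERMARKUPDVALUE kept in the target database; in the pipeline these updates are carried out by activities rather than custom code.

```python
from datetime import datetime, timezone

import psycopg2  # assuming the watermark table is kept in the target PostgreSQL database

TABLE_NAME = "EMPLOYEES"  # placeholder name of the table being tracked

conn = psycopg2.connect(host="myserver.postgres.database.azure.com",
                        dbname="appdb", user="loaduser", password="***")

with conn, conn.cursor() as cur:
    # At the start of the run, capture the current timestamp as WATERMARKUPDVALUE.
    run_start = datetime.now(timezone.utc)
    cur.execute(
        "UPDATE WATERMARKTABLE SET WATERMARKUPDVALUE = %s WHERE TABLE_NAME = %s",
        (run_start, TABLE_NAME),
    )

# ... the delta copy runs here, filtering rows changed after WATERMARKVALUE ...

with conn, conn.cursor() as cur:
    # After a successful load, promote WATERMARKUPDVALUE to WATERMARKVALUE so that
    # the next run only picks up changes made after this run started.
    cur.execute(
        "UPDATE WATERMARKTABLE SET WATERMARKVALUE = WATERMARKUPDVALUE WHERE TABLE_NAME = %s",
        (TABLE_NAME,),
    )
```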
Datasets: Create datasets for the data tables, the deleted records and the watermark table in Azure Data Factory, and create a pipeline with the required trigger intervals.
Linked Service: Create the linked service required for the Oracle connection. The user must have read and write permission on the tables to be copied to Azure Database for PostgreSQL – Flexible Server.
Azure Storage Account: Create an Azure storage account with containers for the table data, the watermark table, the deleted records and the logs (see the sketch below).
Staging for the table data: Staging may not be strictly necessary, but it helps offload the dependency on the source environment.
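As an illustration of the storage layout described above, the containers can also be created programmatically. This is a minimal Python sketch using the azure-storage-blob package with a placeholder connection string and assumed container names; creating them from the Azure portal works just as well.

```python
from azure.core.exceptions import ResourceExistsError
from azure.storage.blob import BlobServiceClient

# Placeholder connection string for the staging storage account.
conn_str = (
    "DefaultEndpointsProtocol=https;AccountName=<account>;"
    "AccountKey=<key>;EndpointSuffix=core.windows.net"
)

service = BlobServiceClient.from_connection_string(conn_str)

# One container per artifact used by the pipeline: table data, watermark table,
# deleted records and logs (names are assumptions).
for container in ("tabledata", "watermark", "deletedrecords", "logs"):
    try:
        service.create_container(container)
    except ResourceExistsError:
        pass  # container is already there from a previous run
```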
Here are the steps to capture incremental data from the source to the target using Azure Data Factory dataflows. This solution provides a delta load scenario where the data is moved from source to target in 30-minute intervals; if the requirement is for a real-time solution, the recommendation is to consider other available options such as open-source or third-party products. The solution has two parts: the first part covers the process, the source configuration in Oracle, the staging configuration in Azure Storage, creating the linked services and the self-hosted integration runtime in Azure Data Factory, and configuring the datasets for the source Oracle database and Azure Storage; the remaining sections are covered in Part 2. By using this solution there is no additional setup required in the source Oracle database, as is typically needed with LogMiner or a traditional Change Data Capture (CDC) tool.
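To see why no LogMiner or CDC configuration is needed on the source, note that the deltas can be identified with an ordinary query against a last-modified column, which is essentially what the watermark-based data flow does. The sketch below uses the python-oracledb driver and a hypothetical LAST_UPDATED column purely for illustration.

```python
import oracledb

# Connection details and column names are placeholders for illustration only.
conn = oracledb.connect(user="scott", password="***", dsn="oraclehost:1521/ORCLPDB1")

# Previous watermark value, normally read from the watermark table.
last_watermark = "2024-01-01 00:00:00"

with conn.cursor() as cur:
    # Only rows changed since the last successful run are selected, so no LogMiner
    # or CDC configuration is required on the source database.
    cur.execute(
        """SELECT emp_id, name, last_updated
             FROM employees
            WHERE last_updated > TO_TIMESTAMP(:wm, 'YYYY-MM-DD HH24:MI:SS')""",
        wm=last_watermark,
    )
    delta_rows = cur.fetchall()
```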
This blog post provides a step-by-step guide for an incremental data load from Oracle to Azure Database for PostgreSQL – Flexible Server using Azure Data Factory dataflows.