Role SSE Energy GDPR: Difference between revisions

From MillerSql.com
NeilM (talk | contribs)
Created page with "text here"
 
NeilM (talk | contribs)
No edit summary
Line 1: Line 1:
text here
SSE - (formerly Scottish and Southern Energy plc)
 
The SSE GDPR contract ran from Tue 11th September 2018 to Fri 21st December 2018
 
At Penner Road, Havant
 
 
Teams: Data Sciences
 
Admin: Visual Studio Team Services, Citrix storefront, Harmony, ITWMS timesheets
 
Daily morning meetings on Skype
 
Required local admin rights for my machine to run IBM Data Manager (.pkg files)
 
The company was de-merging its retail division (merging with NPower) but this was cancelled.
 
BIDS - Business Intelligence Development Studio
 
SSDT - SQL Server Data Tools
 
Dev staging, Enterprise, and Presentation server
 
Dev, Pre-prod, Prod
 
Dialler_Extract
 
E-Commerce
 
E_SITE
 
Release spreadsheet, with a section of rows for each item released.
 
QA testing of releases.
 
Tivoli task scheduler
 
Duties:
 
To write SSIS packages, and IBM Data Manager packages, to pseudonymisation (or delete) user-sensitive data from SSE SQL Server databases (retail customer services databases), for GDPR purposes.
 
The packages had to be properly documented to describe version numbers and what they had to do.
 
IBM Data Manager objects: Data Sources, Attributes, (Hierarchies = sources), (template and Levels = destinations), Derivations, mapping, dimensions, Builds, JobStreams, surrogate keys
 
I wrote SQL code to identify dependencies up and down the chain in IBM Data Manager, so that we could see what other packages in Data Manager would be affected by changes made to individual Data Manager packages.
 
Two types of load: Truncate-load, delete and insert/append mode, and update mode.
 
Soft-deleted records (still present in database, but marked as deleted). RowAction=D
 
Two types of SSIS packages:
 
1. VCRO_CUST_ACCOUNT_CHKSUM.dtsx
 
This compares "source" and "previous" tables:
 
MIS.dbo.VCRO_CUST_ACCOUNT
 
BIS_CDC.dbo.VCRO_CUST_ACCOUNT_PREVIOUS
 
outputting the differences into truncated _CHK_CHANGES table:
 
BIS_CDC.dbo.VCRO_CUST_ACCOUNT_CHK_CHANGES
 
(note it only puts the key field and nature of the change into this table, not every column).
 
2. VCRO_CUST_ACCOUNT_UPDATE_PREVIOUS.dtsx
 
This deletes all rows in BIS_CDC.dbo.VCRO_CUST_ACCOUNT_PREVIOUS
 
that have rows in the _CHK_CHANGES table,
 
and then inserts new rows into the _PREVIOUS table that are in _CHK_CHANGES.
 
I think Data Manager does the actual data migration here??
 
See: 180912_SSIS_packages_Steve.txt
 
It uses checksums to do the comparison of the data.
 
Also _DELETES tables
 
S111_ECOM_DATA, S111_ONZ_DATA, E_REGISTRATION, E_ACTIVE_BASE, E_METER_READING, NBA Feed Analysis
 
Columns: Registration_BKey, RegistrationIDCode, SiteCode,

Revision as of 16:06, 27 September 2024

SSE - (formerly Scottish and Southern Energy plc)

The SSE GDPR contract ran from Tue 11th September 2018 to Fri 21st December 2018

At Penner Road, Havant


Teams: Data Sciences

Admin: Visual Studio Team Services, Citrix storefront, Harmony, ITWMS timesheets

Daily morning meetings on Skype

Required local admin rights for my machine to run IBM Data Manager (.pkg files)

The company was de-merging its retail division (merging with NPower) but this was cancelled.

BIDS - Business Intelligence Development Studio

SSDT - SQL Server Data Tools

Dev staging, Enterprise, and Presentation server

Dev, Pre-prod, Prod

Dialler_Extract

E-Commerce

E_SITE

Release spreadsheet, with a section of rows for each item released.

QA testing of releases.

Tivoli task scheduler

Duties:

To write SSIS packages, and IBM Data Manager packages, to pseudonymisation (or delete) user-sensitive data from SSE SQL Server databases (retail customer services databases), for GDPR purposes.

The packages had to be properly documented to describe version numbers and what they had to do.

IBM Data Manager objects: Data Sources, Attributes, (Hierarchies = sources), (template and Levels = destinations), Derivations, mapping, dimensions, Builds, JobStreams, surrogate keys

I wrote SQL code to identify dependencies up and down the chain in IBM Data Manager, so that we could see what other packages in Data Manager would be affected by changes made to individual Data Manager packages.

Two types of load: Truncate-load, delete and insert/append mode, and update mode.

Soft-deleted records (still present in database, but marked as deleted). RowAction=D

Two types of SSIS packages:

1. VCRO_CUST_ACCOUNT_CHKSUM.dtsx

This compares "source" and "previous" tables:

MIS.dbo.VCRO_CUST_ACCOUNT

BIS_CDC.dbo.VCRO_CUST_ACCOUNT_PREVIOUS

outputting the differences into truncated _CHK_CHANGES table:

BIS_CDC.dbo.VCRO_CUST_ACCOUNT_CHK_CHANGES

(note it only puts the key field and nature of the change into this table, not every column).

2. VCRO_CUST_ACCOUNT_UPDATE_PREVIOUS.dtsx

This deletes all rows in BIS_CDC.dbo.VCRO_CUST_ACCOUNT_PREVIOUS

that have rows in the _CHK_CHANGES table,

and then inserts new rows into the _PREVIOUS table that are in _CHK_CHANGES.

I think Data Manager does the actual data migration here??

See: 180912_SSIS_packages_Steve.txt

It uses checksums to do the comparison of the data.

Also _DELETES tables

S111_ECOM_DATA, S111_ONZ_DATA, E_REGISTRATION, E_ACTIVE_BASE, E_METER_READING, NBA Feed Analysis

Columns: Registration_BKey, RegistrationIDCode, SiteCode,