Role SSE Energy GDPR: Difference between revisions
No edit summary |
|||
(6 intermediate revisions by the same user not shown) | |||
Line 37: | Line 37: | ||
[[Skills_Other_Technical#Other|Tivoli task scheduler]] | [[Skills_Other_Technical#Other|Tivoli task scheduler]] | ||
Duties | == Duties == | ||
To write [[Skills_SQL_Server_Development#SSIS|SSIS]] packages, and IBM Data Manager packages, to apply pseudonymisation (or delete) user-sensitive data from SSE [[Skills_SQL_Server_Development#SQL Server|SQL Server]] databases (retail customer services databases), for GDPR purposes. | |||
To write SSIS packages, and IBM Data Manager packages, to pseudonymisation (or delete) user-sensitive data from SSE SQL Server databases (retail customer services databases), for GDPR purposes. | |||
The packages had to be properly documented to describe version numbers and what they had to do. | The packages had to be properly documented to describe version numbers and what they had to do. | ||
Line 45: | Line 44: | ||
IBM Data Manager objects: Data Sources, Attributes, (Hierarchies = sources), (template and Levels = destinations), Derivations, mapping, dimensions, Builds, JobStreams, surrogate keys | IBM Data Manager objects: Data Sources, Attributes, (Hierarchies = sources), (template and Levels = destinations), Derivations, mapping, dimensions, Builds, JobStreams, surrogate keys | ||
I wrote SQL code to identify dependencies up and down the chain in IBM Data Manager, so that we could see what other packages in Data Manager would be affected by changes made to individual Data Manager packages. | I wrote [[Skills_SQL_Server_TSQL|SQL code]] to identify dependencies up and down the chain in the back end of IBM Data Manager(which was a SQL Server database), so that we could see what other packages in Data Manager would be affected by changes made to individual Data Manager packages. | ||
Two types of load: Truncate-load, delete and insert/append mode, and update mode. | Two types of load: Truncate-load, delete and insert/append mode, and update mode. | ||
Line 76: | Line 75: | ||
I think Data Manager does the actual data migration here?? | I think Data Manager does the actual data migration here?? | ||
It uses checksums to do the comparison of the data. | It uses checksums to do the comparison of the data. |
Latest revision as of 19:49, 2 October 2024
Overview
SSE - (formerly Scottish and Southern Energy plc)
The SSE GDPR contract ran from Tue 11th September 2018 to Fri 21st December 2018
At Penner Road, Havant
Teams: Data Sciences
Admin: Visual Studio Team Services, Citrix storefront, Harmony, ITWMS timesheets
Daily morning meetings on Skype
Required local admin rights for my machine to run IBM Data Manager (.pkg files)
The company was de-merging its retail division (merging with NPower) but this was cancelled.
BIDS - Business Intelligence Development Studio
Dev staging, Enterprise, and Presentation server
Dev, Pre-prod, Prod
Dialler_Extract
E-Commerce
E_SITE
Release spreadsheet, with a section of rows for each item released.
QA testing of releases.
Duties
To write SSIS packages, and IBM Data Manager packages, to apply pseudonymisation (or delete) user-sensitive data from SSE SQL Server databases (retail customer services databases), for GDPR purposes.
The packages had to be properly documented to describe version numbers and what they had to do.
IBM Data Manager objects: Data Sources, Attributes, (Hierarchies = sources), (template and Levels = destinations), Derivations, mapping, dimensions, Builds, JobStreams, surrogate keys
I wrote SQL code to identify dependencies up and down the chain in the back end of IBM Data Manager(which was a SQL Server database), so that we could see what other packages in Data Manager would be affected by changes made to individual Data Manager packages.
Two types of load: Truncate-load, delete and insert/append mode, and update mode.
Soft-deleted records (still present in database, but marked as deleted). RowAction=D
Two types of SSIS packages:
1. VCRO_CUST_ACCOUNT_CHKSUM.dtsx
This compares "source" and "previous" tables:
MIS.dbo.VCRO_CUST_ACCOUNT
BIS_CDC.dbo.VCRO_CUST_ACCOUNT_PREVIOUS
outputting the differences into truncated _CHK_CHANGES table:
BIS_CDC.dbo.VCRO_CUST_ACCOUNT_CHK_CHANGES
(note it only puts the key field and nature of the change into this table, not every column).
2. VCRO_CUST_ACCOUNT_UPDATE_PREVIOUS.dtsx
This deletes all rows in BIS_CDC.dbo.VCRO_CUST_ACCOUNT_PREVIOUS
that have rows in the _CHK_CHANGES table,
and then inserts new rows into the _PREVIOUS table that are in _CHK_CHANGES.
I think Data Manager does the actual data migration here??
It uses checksums to do the comparison of the data.
Also _DELETES tables
S111_ECOM_DATA, S111_ONZ_DATA, E_REGISTRATION, E_ACTIVE_BASE, E_METER_READING, NBA Feed Analysis
Columns: Registration_BKey, RegistrationIDCode, SiteCode,