Role Ovo Energy

Introduction

May 2020 to December 2021.

Five extensions.

This was a 100% work from home project.

The work was the Nova migration, and for the last three months, the SOLR project.

Nova migration

Ovo purchased Spark Energy Ltd. The project was to migrate customer details from Spark's systems into the SMT (Spark Migration Tool), and out to Nova (Ovo's system).

Initially I worked for John S., but towards the end he left the project and I took over some of his responsibilities.

Staging, Tech, and Business databases on Apache Spark SQL on Databricks.

Daily .csv files with the day's date in the filename were sent via FTP by Spark Energy into Ovo's system, the S: drive.

A Streamsets service copied them into Azure Blob Storage.

Using Databricks, we created read-only SQL table objects backed by these files. If the files changed, we had to run a SQL REFRESH TABLE command so that queries picked up the new data.
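A minimal sketch of what such a file-backed table might look like in Spark SQL; the database, table, column, and path names here are illustrative, not the project's actual ones.

-- Illustrative external table over the inbound .csv files in Blob Storage.
CREATE TABLE IF NOT EXISTS inbound.customer_accounts_csv (
  spark_acc_no STRING,
  mpxn         STRING,
  meter_read   STRING
)
USING CSV
OPTIONS (header = 'true', delimiter = ',')
LOCATION '/mnt/inbound/spark_energy/customer_accounts/';

-- If the underlying files change, refresh Spark's cached metadata so
-- subsequent queries see the new file contents.
REFRESH TABLE inbound.customer_accounts_csv;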

Ingestion: next, the "staging" SQL jobs in Databricks (notebooks) loaded each day's files into Azure Delta Lake read-write tables ("staging tables"), appending to the existing data, with the name of the inbound file added as one of the columns via the input_file_name() function.
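A hedged sketch of that staging load, reusing the illustrative names from the previous example; the real notebooks and schemas differed.

-- Illustrative Delta staging table; the file name column records provenance.
CREATE TABLE IF NOT EXISTS staging.customer_accounts (
  spark_acc_no STRING,
  mpxn         STRING,
  meter_read   STRING,
  load_date    DATE,
  source_file  STRING
) USING DELTA;

-- Append today's inbound rows, tagging each with the originating .csv file.
INSERT INTO staging.customer_accounts
SELECT
  spark_acc_no,
  mpxn,
  meter_read,
  current_date()    AS load_date,
  input_file_name() AS source_file
FROM inbound.customer_accounts_csv;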

Retaining the previous files' data served as an archive of the inbound files. The idea here is that data storage is very cheap, so we could afford a lot of storage for this archive, whereas processing power is more expensive. For that reason the compute part of Databricks, the cluster, was started up to run jobs and stopped when not in use.

Next, the "tech" job truncate-loaded or append-loaded the data from the staging tables into the Delta Lake "tech" tables.
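The two load styles are alternatives that look roughly like this in Spark SQL (again with illustrative table names): INSERT OVERWRITE replaces the tech table's contents, while INSERT INTO appends to them.

-- Truncate-load: replace the tech table's contents with today's staging rows.
INSERT OVERWRITE TABLE tech.customer_accounts
SELECT * FROM staging.customer_accounts
WHERE load_date = current_date();

-- Append-load: add today's staging rows to the existing tech table contents.
INSERT INTO tech.customer_accounts
SELECT * FROM staging.customer_accounts
WHERE load_date = current_date();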

Next, scheduled daily and manual SQL jobs ran to cleanse and process the data, including dealing with duplicates.
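One common de-duplication pattern of this kind, sketched with assumed column names, keeps only the most recently loaded row per account and meter.

-- Keep only the latest row per account/meter combination.
CREATE OR REPLACE TEMPORARY VIEW customer_accounts_deduped AS
SELECT spark_acc_no, mpxn, meter_read, load_date
FROM (
  SELECT *,
         row_number() OVER (PARTITION BY spark_acc_no, mpxn
                            ORDER BY load_date DESC) AS rn
  FROM tech.customer_accounts
) ranked
WHERE rn = 1;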

Next, SQL jobs ran to output processed data into output tables, in a format required for the destination systems (Nova and others).

Finally, jobs ran to output the data from the "output" tables into write-only tables in Azure Blob Storage (setting the "Extracted" field to "Y"). This process generated new .csv files, which Streamsets copied out from Azure to the Ovo S: drive, ready to be sent to the destination systems.
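A rough sketch of that extract step in Spark SQL, assuming the Extracted flag starts as "N" (an assumption); the directory path and column names are illustrative.

-- Write the not-yet-extracted rows out as .csv files in Blob Storage.
INSERT OVERWRITE DIRECTORY '/mnt/outbound/nova/customer_accounts'
USING CSV
OPTIONS (header = 'true')
SELECT spark_acc_no, mpxn, meter_read
FROM output.customer_accounts
WHERE Extracted = 'N';

-- Mark those rows as having been extracted.
UPDATE output.customer_accounts
SET Extracted = 'Y'
WHERE Extracted = 'N';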

I ran these daily jobs, investigated any errors or problems arising from them, and ran scripts to confirm whether the jobs had succeeded.

I also investigated problems and changes raised by other parts of the business.

Business logic:

Each customer's meter had a unique ID, the MPXN, in the MCT (Migration Control Table), and each customer had an ID called the Customer_skey. Customers were referenced with a different ID number in the source system: the Spark_acc_no.

At the start of the migration process the "status" column is set to 0. As each stage of the migration is completed, the status advances through 1, 2, 3... up to "7S".

Examples of stages are 0, 0S "Eligible", 1 "Selected", and so on up to the data having been sent to all destinations. Note that a status suffixed with Z, such as "4Z", means an error has occurred at that stage, requiring a manual or batch fix.

To progress a customer through each stage, a record was added for the customer in the mct_process table with a particular task_id. The daily job that processes that task_id would then see this record and process the customer in the prescribed way. Once complete, the record in mct_process would move from "pending" to "in-flight" to "complete", and the customer's status in the mct table would be moved to the next status (see the sketch after the next paragraph).

In general when a customer is starting a stage, its status will be appended with "A", such as "4A", and when the stage is complete it will be promoted to "4S".
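The pattern described above, sketched as Spark SQL with assumed column names (mct_process assumed to hold mpxn, task_id, and a process status; mct assumed to hold mpxn and status); the real tables will have had more columns.

-- Queue one meter for task 4 (assumed schema: mpxn, task_id, process_status).
INSERT INTO mct_process VALUES ('1234567890123', 4, 'pending');

-- The daily job for task_id 4 picks up pending records and marks them in-flight.
UPDATE mct_process
SET process_status = 'in-flight'
WHERE task_id = 4 AND process_status = 'pending';

-- ... the task-specific processing for task_id 4 runs here ...

-- On success, close the process record and promote the meter's MCT status
-- from the in-progress suffix "4A" to the completed suffix "4S".
UPDATE mct_process
SET process_status = 'complete'
WHERE task_id = 4 AND process_status = 'in-flight';

UPDATE mct
SET status = '4S'
WHERE mpxn = '1234567890123' AND status = '4A';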

_status tables held an archive of meters' previous statuses (a customer is referred to by meter, with a customer typically having two meters: one electric and one gas).

The admin_data table holds fixed system variables.

Meters were processed in different tranches: "Mass Event 1", "Mass Event 2", and a "Mopup" tranche to handle the remaining items.

Initial Comms

Welcome Comms - Letters to customers sent out by Latcham

Open Enquiries

Open Complaints

Meter Reads

Opening Balance

"Secure" file to "secure meter" - who move the meters themselves across - the equivalent to the DCC.

MPXN:

MPAN = "Meter Point Administration Number" - Electric

MPRN = "Meter Point Reference Number" - Gas

SSD = Supplier Start Date

PAN number

id_crac

"Server Load" - Load of data into Nova

"Filemover" scheduled sFTP copying process

Files for manual processing were loaded onto Sharepoint, though this was phased out part way through the project.

Some customers were sent to "Boost" - Boost is part of the OVO business.

Documentation

Databricks widgets - Runtime variables to specify prod or dev systems.
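A minimal sketch of that pattern in a Databricks SQL notebook of the era; the widget name and the database naming convention are assumptions.

-- Define a text widget with a default value of "dev".
CREATE WIDGET TEXT environment DEFAULT "dev";

-- The widget value is substituted into the SQL before it runs, so it can
-- point a job at the dev or prod database (illustrative naming).
SELECT * FROM business_${environment}.admin_data;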

The "App" migration - sending customer data to Paypoint.

Dealing with special characters in Notepad++

Putting in artificial strings to be replaced with pipes to allow pipes to serve as CRLF characters. This allows original CRLF characters to be retained in the text.

Heritage vs Smart meters

MASUS

cosgain_response

Awaiting CREGI

Multi_MPAN

Gentrack

Reporting dashboards

Backup Tables job, to back up all tables to the business database, which served as an archive.
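A per-table backup step of this kind could be as simple as a CREATE TABLE AS SELECT into the business database; the source table and the dated backup name here are illustrative.

-- Archive a copy of a table into the business database.
CREATE TABLE IF NOT EXISTS business.mct_backup_20211001
AS SELECT * FROM tech.mct;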

Python scripts to run in Databricks to list and delete files, and run notebooks.

The SQL MERGE statement as used in Spark SQL: MERGE INTO ... USING (table) ... WHEN MATCHED THEN UPDATE, otherwise INSERT.
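The upsert shape described above looks roughly like this in Spark SQL on Delta tables (illustrative table and key names).

MERGE INTO tech.customer_accounts AS target
USING staging.customer_accounts AS source
ON  target.spark_acc_no = source.spark_acc_no
AND target.mpxn = source.mpxn
WHEN MATCHED THEN
  UPDATE SET *
WHEN NOT MATCHED THEN
  INSERT *;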

SOLR project

In the middle of 2021 there was a large increase in wholesale gas prices. This caused a number of energy companies with customers on fixed-rate tariffs to get into financial difficulty and fail. To ensure those customers remained on supply, the government operated a SOLR (Supplier of Last Resort) scheme, which moved them to another energy company. Ovo Energy was one of these SOLR companies.

The project was to write database code to allow customer details to be supplied to Ovo in a defined format, and make that data available to be imported into Ovo's internal content management systems.

Chris (my manager) and I re-purposed code from the previous Ovo project for this new use, with a readiness to adapt it to cater for differences between the various inbound data formats.

This was a short project for Ovo Energy, again working for Chris. It was cut short on 25th December when the company decided to discontinue bidding for SOLR customers.