This is a self-archived, parallel published version of this article in the publication archive of the University of Vaasa. It might differ from the original.

Author(s): Hovila, Petri; Monot, Aurelien; Laaksonen, Hannu; Rita-Kasari, Matti
Title: Advanced utilization of big data for real-time monitoring and data analytics in Sundom smart grid
Year: 2019
Version: Publisher’s PDF
Copyright: ©2019 CIRED – International conference and exhibition on electricity distribution, published by AIM.

Please cite the original version: Hovila, P., Monot, A., Laaksonen, H., & Rita-Kasari, M. (2019). Advanced utilization of big data for real-time monitoring and data analytics in Sundom smart grid. In: Proceedings of 25th International Conference on Electricity Distribution: CIRED 2019: Madrid, 3-6 June 2019. Conference proceedings CIRED 0744. https://www.cired-repository.org/handle/20.500.12455/134

25th International Conference on Electricity Distribution, Madrid, 3-6 June 2019, Paper n° 0744, CIRED 2019

ADVANCED UTILIZATION OF BIG DATA FOR REAL-TIME MONITORING AND DATA ANALYTICS IN SUNDOM SMART GRID

Petri HOVILA, ABB, Finland, petri.hovila@fi.abb.com
Aurelien MONOT, ABB Corporate Research, Switzerland, aurelien.monot@ch.abb.com
Hannu LAAKSONEN, University of Vaasa, Finland, hannu.laaksonen@uwasa.fi
Matti RITA-KASARI, Jubic, Finland, matti.rita-kasari@jubic.fi

ABSTRACT

Modern technologies are increasingly used in many protection, control and monitoring solutions for power distribution grids. Voltage and current measurements are nowadays usually converted to digital signals, which are further processed and stored in different locations depending on the application. At the same time, as data processing and storage capacity increases, the number of new data sources also increases.
This means that the data transfer capacity may also need to be increased at different levels in future distribution network management and protection architectures. This paper presents how IEC 61850-9-2 standard based raw data streamed from the Sundom Smart Grid can be processed and utilized for power quality monitoring and analytics applications. Digitalization also enables several other parallel applications, since the same data stream can be used for several targets when the connectivity, interoperability and data quality are sufficient. In this paper some of these potential parallel applications are also introduced.

INTRODUCTION

In the future, distributed energy resources (DER) and real-time measurements in active network management (ANM) will have a key role in improving electricity supply reliability and local distribution grid resiliency. Accurate real-time measurements and utilization of fast communication technologies will enable the real-time response of the DER as well as improved monitoring and state estimation. Accurate state estimation is an important input for different ANM functionalities in distribution grids, as well as for potential future big data analytics (BDA) based enhanced monitoring and predictive protection solutions.

Handling the data generated in smart grids with traditional analyses is challenging, in particular when actionable information must be produced within required timeframes. In general, the objective of BDA is to investigate very large volumes of data from multiple smart grid components and transform the data into meaningful inputs like patterns, trends, fault detection and control commands for machine learning applications. The outcome of these applications may identify operating trends leading to failure patterns of devices or components and in this way enable new proactive protection and predictive maintenance solutions [1].

Power Quality (PQ) monitoring plays an important role in ensuring reliable operation and electricity supply of distribution grids [2].
In the future, BDA based proactive protection functionalities will be increasingly based on advanced, real-time utilization of centralized monitoring of many different PQ indices simultaneously from multiple measurement points. Centralized monitoring of PQ parameter trends, variations and temperature measurements from different feeders and measurement points can potentially be used to detect faults before they happen, improve fault location and detect measurement errors. In addition, it can enable supply level differentiation for customers and verification of the realized power quality for these purposes. In the future, analytics comprising BDA, machine learning and artificial intelligence will have a significant role in making smart grids more intelligent as well as more cost- and energy-efficient.

One possibility is to utilize calculations which are integrated into IEDs and send pre-processed values by GOOSE messages. This approach reduces the amount of data, but all calculations must be integrated into the IEDs. As a proof of concept, we developed DAGR (Data Analytics for Smart Grids), a prototype solution for PQ monitoring based on IEC 61850-9-2 Sampled Value (SV) stream data.

SUNDOM SMART GRID

The developed DAGR platform is running in the Sundom Smart Grid (SSG) pilot of ABB Oy, Vaasan Sähkö (local DSO), Elisa (telecommunication/IT company) and University of Vaasa in Vaasa, Finland [3]-[6]. The goal of the SSG pilot has been to enhance the electricity supply reliability by new grid automation solutions for more accurate earth-fault detection and localization in mixed compensated distribution grids (overhead, OH-line & cable). In addition, the SSG focus has been on integrating and managing renewables and other DER units in an intelligent way.

The Sundom Smart Grid consists of one primary HV/MV substation and four secondary MV/LV substations (Fig. 1). Today there are two distributed generation (DG) units connected to SSG (Fig. 1): a full-power-converter based wind turbine (3.6 MW) connected to the MV network with its own MV feeder J08 (Fig. 1), and an inverter based PV unit (33 kW) connected to the LV network at MV/LV substation TR4318 (Fig. 1).

On top of the IEDs required to protect and operate the SSG, additional functionality in the IEDs has been activated in the substations in order to capture IEEE 1588 time-synchronized measurements of current and voltage values from multiple points and transfer them to a data center using the IEC 61850-9-2 standard protocol. Additional IEDs have also been installed in secondary substations. This system sends its data in real time through the optical fiber network (Fig. 1). All the data are sent to a centralized server (Fig. 1). In total, 20 IEDs are sending IEC 61850-9-2 SV streams with a sampling rate of 4 kHz, as well as GOOSE data.

Figure 1. Sundom Smart Grid single line diagram (SLD) and data measurement points.

REAL-TIME POWER QUALITY MONITORING

To demonstrate that we could handle big data processing in real-time, we developed DAGR, a solution for PQ monitoring. As defined by the IEC 61000 standards, we are computing the following parameters every ten periods: power, reactive power, frequency, total and fundamental signal RMS as well as THD and harmonics (up to the 40th) for current and voltage.

DAGR software architecture

A combination of open-source libraries, existing ABB libraries, and newly developed software components is used in our implementation.

Figure 2. DAGR software components. Open-source software is in black, existing ABB code in red, code developed reusing some existing code in grey, and code developed from scratch in white.

Fig. 2 shows how the data flows from one software component to another.
The incoming sampled value streams are captured on the Ethernet port using libpcap (open source). The Ethernet frames are then parsed using an existing ABB internal library. We adapted and extended a library from ABB, which contains optimized implementations of preprocessing functions such as FFTW (Fastest Fourier Transform in the West). The resulting computed parameters are stored using the open-source time-series database InfluxDB. A parallel data flow evaluates the resulting parameters to detect power quality disturbances using a software component developed during the project. If a disturbance is detected, the raw data corresponding to the detected disturbance is stored in another running instance of InfluxDB dedicated to power quality disturbances. At the same time, we also run another software component to classify the type of disturbance, as described next.

PQ disturbance classification

We implemented a mechanism to detect and classify disturbances based on [7]. Specifically, a rule-based decision tree is used for classification, where the rules are established based on electrical engineering knowledge. This classifier was chosen because it is efficient and can therefore be applied in real-time, and because it uses parameters, namely FFT and RMS, that are already computed. A test with simulated signals resulted in a classification accuracy of 94% for a signal-to-noise ratio of 30 dB. Furthermore, we compute the duration of the detected disturbance and store the corresponding raw data so that further analyses can be run.
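As a rough illustration of the idea above, the following sketch computes RMS and THD over one ten-cycle window and feeds them to a small rule-based classifier. It is not the DAGR code and not the decision tree of [7]: the function names, the class labels and all thresholds (0.1, 0.9 and 1.1 per unit, 8 % THD) are assumptions chosen for the example, and a plain single-bin DFT stands in for the optimized FFT library.

```python
import cmath
import math

FS = 4000      # sampling rate in Hz (from the text)
F0 = 50        # nominal grid frequency in Hz
WINDOW = 800   # samples in 10 cycles at 50 Hz and 4 kHz


def harmonic_rms(samples, order):
    """RMS value of one harmonic, from a single-bin DFT over the window."""
    n = len(samples)
    w = -2j * math.pi * order * F0 / FS
    acc = sum(x * cmath.exp(w * k) for k, x in enumerate(samples))
    return abs(acc) * 2 / n / math.sqrt(2)


def window_metrics(samples, max_order=39):
    """Total RMS, fundamental RMS and THD for one window.

    Order 40 falls exactly on the Nyquist frequency at 4 kHz,
    so this sketch stops at order 39.
    """
    total = math.sqrt(sum(x * x for x in samples) / len(samples))
    fund = harmonic_rms(samples, 1)
    harm = [harmonic_rms(samples, h) for h in range(2, max_order + 1)]
    thd = math.sqrt(sum(h * h for h in harm)) / fund if fund > 0 else float("inf")
    return {"rms": total, "fundamental_rms": fund, "thd": thd}


def classify(metrics, nominal_rms, thd_limit=0.08):
    """Toy rule-based decision tree; every threshold here is an assumption."""
    pu = metrics["rms"] / nominal_rms
    if pu < 0.1:
        return "interruption"
    if pu < 0.9:
        return "sag"
    if pu > 1.1:
        return "swell"
    if metrics["thd"] > thd_limit:
        return "harmonics"
    return "normal"


def synth(amplitude, harmonics=None, n=WINDOW):
    """Synthetic test signal: fundamental plus {order: relative amplitude}."""
    harmonics = harmonics or {}
    out = []
    for k in range(n):
        t = k / FS
        v = amplitude * math.sin(2 * math.pi * F0 * t)
        for order, rel in harmonics.items():
            v += amplitude * rel * math.sin(2 * math.pi * order * F0 * t)
        out.append(v)
    return out


if __name__ == "__main__":
    nominal = 325.0 / math.sqrt(2)
    for name, sig in [("clean", synth(325.0)),
                      ("half amplitude", synth(162.5)),
                      ("20 % fifth harmonic", synth(325.0, {5: 0.2}))]:
        print(name, "->", classify(window_metrics(sig), nominal))
```

Because the rules only read values that the monitoring path has already produced, classification adds little work per window, which matches the efficiency argument made for the classifier above.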
Handling big data in real-time

The efficient processing of the power quality parameters was obtained by using multi-threading while avoiding concurrency issues by design, and thus avoiding the use of mutex (mutual exclusion) primitives, which would have slowed down the process significantly.

The raw data is extracted using an existing IEC 61850-9-2 parser on the raw network data. Data coming from each sampled value stream (i.e., measurement point) is copied into its corresponding ring buffer. There are as many ring buffers as sampled value streams; this number is a static configuration parameter described in a file.

Fig. 3 describes the different threads running in parallel as well as the batch of samples each thread is handling. The insertion “fill” threads (in green) are running on batches of size X. The “processing” threads (in orange) are working with batches of size Y. The “store” threads (in red) are working on batches of size Z. A fourth type of thread, the “query” threads (in blue), is responsible for sending data to the databases. As single insertions in the database are costly, inserting in batches allows the time-series database to be used more efficiently.

Figure 3. Batch processing of the raw data.

Such batch processing improves performance for two reasons. First, processing data in batches decreases the fixed overhead of certain operations such as requests to the database. Second, it takes advantage of the CPU cache and gives a significant speed-up, in particular when processing the raw data to compute power quality metrics using math algorithms such as a Fourier transform.

Fig. 4 shows the batch sizes used in DAGR starting from the raw data.
For a 50 Hz line, each input stream amounts to 4 000 samples per second, corresponding to 800 samples for 10 cycles, the window defined in the standard to compute the relevant metrics for PQ monitoring. Each processing thread operates on batches of between 4 000 and 12 000 samples. Finally, storage is triggered when metrics are computed on the equivalent of between 320 000 and 640 000 samples.

Figure 4. Batch processing of the raw data.

Performance and sizing

These implementation optimizations allow our solution to scale in multiple ways. Test experiments have shown that our implementation can handle up to 80 SV streams on a COTS PC (quad-core i5), which is 4 times the size of the Sundom Smart Grid, or up to 10 streams on a Raspberry Pi 3.

Table 1 describes the required amount of raw data to transfer and process per stream and for our smart grid pilot. We focus here on the SV streams as the GOOSE bandwidth is negligible in comparison. For this estimation, we use SV packets of 140 bytes (B), the maximum size, and a sampling rate of 4 kHz.

Table 1. Amount of raw data to be transferred and processed for SV streams with current and voltage with three phases and neutral.

| Time span | Per SV stream | For the Sundom Smart Grid |
|---|---|---|
| 1 second | ~ 500 KB | ~ 10 MB |
| 1 day | ~ 40-45 GB | ~ 800-900 GB |
| 1 year | ~ 15 TB | ~ 300 TB |

When running DAGR, we are storing around 30 GB of data per day in our database. The computed metrics and raw signal (in case of disturbance) are stored using InfluxDB, an open-source time-series database, and are displayed using Grafana as shown in Fig. 5.

Figure 5. Screenshot of the interface of the DAGR tool displaying power quality parameters.

Although we observe a reduction by a factor of around 30 in data size between the raw signal and the resulting metrics, this still results in a significant amount of data: around 10 TB per year. Since the metrics are computed every 10 cycles, we have a granularity of 200 ms for a grid operating at 50 Hz.
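The orders of magnitude in Table 1 and the storage figures above can be checked with a few lines of arithmetic. The sketch below assumes that one 140-byte SV packet is sent per sample, that a year has 365 days, and that the table is rounded from binary units (KiB, MiB, GiB, TiB); none of these assumptions is stated in the paper, and the variable names are only illustrative.

```python
PACKET_BYTES = 140       # SV packet size used for the estimate in the text
SAMPLING_HZ = 4000       # samples per second per stream
STREAMS = 20             # IEDs sending SV streams in the pilot
STORED_GB_PER_DAY = 30   # data stored by DAGR per day (from the text)
GRID_HZ = 50
CYCLES_PER_WINDOW = 10

KIB, MIB, GIB, TIB = 2**10, 2**20, 2**30, 2**40
DAY = 24 * 3600
YEAR = 365 * DAY


def raw_bytes(seconds, streams=1):
    """Raw SV volume, assuming one packet is sent per sample."""
    return PACKET_BYTES * SAMPLING_HZ * seconds * streams


# Rows of Table 1: per stream, then for the whole pilot.
per_stream_second_kib = raw_bytes(1) / KIB
grid_second_mib = raw_bytes(1, STREAMS) / MIB
per_stream_day_gib = raw_bytes(DAY) / GIB
grid_day_gib = raw_bytes(DAY, STREAMS) / GIB
per_stream_year_tib = raw_bytes(YEAR) / TIB
grid_year_tib = raw_bytes(YEAR, STREAMS) / TIB

# Storage after metric computation, and the time resolution of the metrics.
reduction_factor = grid_day_gib / STORED_GB_PER_DAY
stored_tb_per_year = STORED_GB_PER_DAY * 365 / 1000
samples_per_window = SAMPLING_HZ * CYCLES_PER_WINDOW // GRID_HZ
window_ms = 1000 * CYCLES_PER_WINDOW / GRID_HZ

if __name__ == "__main__":
    print(f"1 s   : {per_stream_second_kib:.0f} KiB, {grid_second_mib:.1f} MiB")
    print(f"1 day : {per_stream_day_gib:.1f} GiB, {grid_day_gib:.0f} GiB")
    print(f"1 year: {per_stream_year_tib:.1f} TiB, {grid_year_tib:.0f} TiB")
    print(f"reduction ~{reduction_factor:.0f}x, "
          f"stored ~{stored_tb_per_year:.1f} TB/year, "
          f"window {samples_per_window} samples = {window_ms:.0f} ms")
```

Under these assumptions the raw volume comes out at about 547 KiB and 10.7 MiB per second, roughly 45 GiB and 900 GiB per day, and about 16 TiB and 320 TiB per year, which is at or slightly above the rounded values in Table 1. The ratio of raw to stored data is close to 30, the stored volume is close to 11 TB per year, and the last line reproduces the 200 ms granularity of the ten-cycle window.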
Depending on the use case, such granularity might not always be needed. Down-sampling the output of our solution would greatly reduce the required storage space, while still being detailed enough for the considered use cases.

FUTURE APPLICATIONS

In this paper we address the possibility to utilize SV streams from multiple locations as a source for advanced analytics and real-time monitoring of the distribution network. Since the amount of raw real-time data output as specified in IEC 61850-9-2 is large, using these sources for big data applications requires smart filtering and pre-processing of data for different applications (Table 2).

Table 2. Different types of data (data filtering and pre-processing) and data storage needed in future big data applications in smart grids.

| Data storage time | Application examples | Type of data |
|---|---|---|
| ms - sec | Real time protection | samples or pre-processed |
| sec - min | Local monitoring | samples or pre-processed |
| hours - days | Local statistics | e.g. averages |
| years | Historically based statistics and analytics | Snapshot of samples, pre-processed or average |

In addition to PQ measurements, other kinds of digitized data (e.g., on-load tap changer (OLTC) number of operations, temperature, and thermal, noise and voice sensor measurements of network components such as feeders, transformers and circuit-breakers) could be stored and utilized with advanced IoT and big data analytics tools and cloud-based platforms to enable future proactive protection and predictive maintenance functionalities of smart digital primary and secondary substations. This will enhance the electricity supply reliability as well as the cost, material and energy efficiency of future smart grids.

In general, IEEE 1588 time-synchronized IEC 61850-9-2 SV current and voltage measurements could also be used in a similar manner as synchrophasor or phasor measurement unit (PMU) measurements.
Synchrophasor applications are generally categorized as 1) offline applications (disturbance analysis and power system dynamic model tuning and validation), 2) online monitoring applications (enhanced power system state estimation, frequency and phase angle monitoring, line thermal loading monitoring, voltage instability monitoring, and oscillation monitoring), and 3) real-time control/protection applications (power flow control, oscillation and damping control, and emergency control against voltage, rotor angle, or frequency instability). Recent developments in ICT technologies and infrastructures have increased the feasibility of synchrophasor applications. Through improved real-time state estimation and topology verification by accurate, time-synchronized synchrophasor/PMU measurements, the DG hosting capacity of the distribution network could be increased without network reinforcements, and the utilization of active network management (ANM) functionalities could be enabled at large scale.

CONCLUSIONS

In general, centralized monitoring of PQ parameter trends, variations and temperature measurements from different feeders and measurement points could be increasingly used in the future for proactive protection and predictive maintenance functionalities by big data analytics tools. In terms of data transfer and processing, it is important to take into account the needs of different applications. In the future, local and distributed control architectures can provide solutions that reduce the data transmission load and computational resources required by fully centralized solutions.

This paper presented how IEC 61850-9-2 standard based raw data streams could be processed hierarchically and utilized in PQ monitoring and analytics applications. As a proof of concept, DAGR (Data Analytics for Smart Grids), a prototype solution for PQ monitoring based on IEC 61850-9-2 Sampled Value (SV) stream data, was developed for the Sundom Smart Grid pilot.
At the end of the paper some other potential parallel applications utilizing the same SV data stream were also introduced.

REFERENCES

[1] IEEE Smart Grid Big Data Analytics, Machine Learning and Artificial Intelligence in the Smart Grid Working Group, 2017, "Big Data Analytics in the Smart Grid, IEEE White paper", (https://smartgrid.ieee.org/resources/white-papers/big-data-analytics-in-the-smart-grid)
[2] E. Gasch, J. Meyer, P. Schegner, K. Schmidt, 2015, "Efficient Power Quality Analysis of Big Data (Case Study for a Distribution Network Operator)", 23rd International Conference on Electricity Distribution (CIRED 2015), Lyon, France.
[3] A. Monot, M. Wahler, J. Valtari, M. Rita-Kasari and J. Nikko, 2016, "Real-time research lab in the Sundom smart grid pilot", CIRED 2016 Workshop, Helsinki, Finland.
[4] H. Laaksonen, P. Hovila, 2016, "FlexZone Concept to Enable Resilient Distribution Grids – Possibilities in Sundom Smart Grid", CIRED 2016 Workshop, Helsinki, Finland.
[5] H. Laaksonen, P. Hovila, 2017, "Future-proof Islanding Detection Schemes in Sundom Smart Grid", 24th International Conference on Electricity Distribution (CIRED 2017), Glasgow, Scotland.
[6] H. Laaksonen, P. Hovila, K. Kauhaniemi, K. Sirviö, 2018, "Advanced Islanding Detection in Grid Interactive Microgrids", CIRED Workshop 2018, Ljubljana, Slovenia.
[7] M. Zhang, K. Li and Y. Hu, 2011, "A Real-Time Classification Method of Power Quality Disturbances", Electric Power Systems Research vol. 81:2, 660-666.