Jeremi Junka

Literature Review of the ETL and ELT Data Integration Pipelines

Vaasa 2026
School of Technology and Innovations
Bachelor's thesis in Technology
Data architecture

UNIVERSITY OF VAASA
School of Technology and Innovations
Author: Jeremi Junka
Title of the Thesis: Literature Review of the ETL and ELT Data Integration Pipelines
Degree: Bachelor's thesis in Technology
Programme: Data architecture
Supervisor: Maarit Välisuo
Year: 2026
Number of pages: 37

ABSTRACT:
This thesis is a literature review of the ETL (Extract-Transform-Load) and ELT (Extract-Load-Transform) data integration processes. The literature review focuses on the processes, how they operate, and how they can be optimized from a performance standpoint. Both ETL and ELT processes are used for the same purpose, but their implementations differ significantly. In ETL processing the transformation is performed before the data is loaded into the target system, whereas in ELT processing the transformation is performed after the data has been loaded into the target system. This thesis focuses on the three main phases that occur in both data integration pipeline models, and the literature review examines the optimization of these three main phases. The thesis also compares the suitability of the two processing models for different architectures.
KEYWORDS: Data, data processing

Contents

1 Introduction 6
1.1 Background and motivation 6
1.2 Problem definition 6
1.3 Research objectives and scope 7
1.4 Structure of the literature review 7
2 Stages of data integration process 8
2.1 Data extraction phase 8
2.2 Data transformation phase 8
2.3 Data load phase 9
3 ETL 11
3.1 ETL data processing method 11
3.1.1 Data extraction process for ETL 11
3.1.2 Data transformation process for ETL 12
3.1.3 Data loading process for ETL 13
3.1.4 Data transfer process within ETL processes 13
3.2 Key performance indicators 14
3.2.1 Throughput 14
3.2.2 Latency 14
3.3 Optimization of ETL models 15
3.3.1 Data extraction optimization 16
3.3.2 Data transformation optimization 16
3.3.3 Data loading optimization 17
3.4 Relevance of ETL tools in data architecture 18
4 ELT 19
4.1 ELT data processing method 19
4.1.1 Data extraction process for ELT 19
4.1.2 Data load process for ELT 20
4.1.3 Data transformation process for ELT 20
4.1.4 Data transfer process within ELT processes 21
4.2 Key performance indicators 22
4.2.1 Throughput 23
4.2.2 Latency 23
4.3 Optimization of ELT models 24
4.3.1 Data extraction process optimization 24
4.3.2 Data load process optimization 25
4.3.3 Data transformation process optimization 25
4.4 Relevance of ELT tools in data architecture 26
5 Comparison between ETL and ELT tools 27
5.1 Overall process comparison 27
5.2 Process comparison between ETL and ELT tools 27
5.2.1 Extraction process review 28
5.2.2 Transformation process review 29
5.2.3 Load process review 30
5.3 Suitability for different environments and use cases 31
6 Conclusions 32
References 34

Figures

Figure 1 "A diagram depicting an ETL pipeline for the integration of dashboard data." (Myakala, Bura & Juma, 2024). 14
Figure 2 "ELT Process" (Qlik, n.d.). 22

Abbreviations

ETL - Extract-Transform-Load
ELT - Extract-Load-Transform

1 Introduction

Data pipelines play a key role in modern data management.
They can be utilized for varying purposes, for example data collection or information delivery. ETL and ELT pipelines have been widely adopted as data integration needs have increased over time. These pipeline models have many similarities, as both apply a phased structure to process data, but they also differ, for example in the location of the data transformation phase. ETL pipelines are the more traditional model. In the ETL pipeline model, data is transformed before it is loaded into the target system. ETL pipelines offer strong data governance possibilities. ELT pipeline models handle data transformation after loading the data into the target system, using the target system's operational capabilities. ELT pipelines offer scalable cloud storage options for the storage and transformation of data. Both pipeline models are used in the collection, remodelling and storage of data. The architectures and processes of the two pipeline models differ, but both are used to achieve the same goal.

1.1 Background and motivation

The motivation for this thesis is the limited amount of research comparing the ETL and ELT data integration pipeline processes. Studies that examine only one of the two pipeline models are numerous, but comparative studies of both are rare, especially from a performance standpoint. The increased use of data integration tools also presents a need for further study in this field.

1.2 Problem definition

This literature review studies the similarities and differences of ETL and ELT pipelines from the standpoint of process, performance and optimization. Building on prior research into this topic, this literature review examines how ETL and ELT pipelines operate and evaluates how these pipelines can be optimized.

1.3 Research objectives and scope

The scope of this study is a review of ETL and ELT data integration pipelines, their architecture and processes, based on academic studies and industry articles.
The study mainly focuses on the processes within the pipelines and on optimizing these processes to enhance performance. While no benchmarks are run in this study, the analysis is based on findings in the relevant literature.

1.4 Structure of the literature review

The literature review starts by examining the main phases of both ETL and ELT data integration pipelines. Each pipeline model is then examined in its own chapter. These chapters review the structure of the pipeline, the optimization of each phase of that structure, and key performance indicators. This is followed by a comparison of the two pipeline models, which includes an assessment of their similarities and differences and an analysis of which pipeline model is suitable for which cases. The final chapter summarises the findings of the previous chapters.

2 Stages of data integration process

Both the ETL and ELT processes occur in phases. These phases are categorised as data extraction, data transformation and data load. Each of these phases is an integral part of the process and is necessary for the system to operate.

2.1 Data extraction phase

The data extraction phase is the process in which both ETL and ELT pipelines extract data from multiple, often heterogeneous, sources. In this phase both the ETL process and the ELT process gather data for further processing in the later stages. Nithish et al. (2024) state that in data integration pipelines the first stage is the extraction phase. This applies to both ETL and ELT data integration pipelines. This data collection is commonly done on a large scale so as to have the necessary amount of data to process into valuable information.
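As a minimal illustration of this phase, the sketch below pulls rows from a relational source into plain Python dictionaries ready for staging. An in-memory SQLite database stands in for a real source system; the `orders` table and its columns are hypothetical.

```python
import sqlite3

def extract_rows(connection, table):
    """Pull every row from one source table into plain dictionaries."""
    connection.row_factory = sqlite3.Row
    cursor = connection.execute(f"SELECT * FROM {table}")
    return [dict(row) for row in cursor.fetchall()]

# Illustrative source: an in-memory database standing in for a real system.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 12.0)])

rows = extract_rows(source, "orders")
print(len(rows))  # 2 rows staged for the next phase
```

In a real pipeline the same pattern would be repeated per source, with an API client or file reader replacing the database query for non-relational sources.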
The article goes into detail on how different source systems, such as databases and APIs, along with other data sources, can be used in the extraction process, and on how both structured and unstructured data can be extracted. Shaker, Abdeltawab & El Bastawissy (2011) state that the extraction phase is the first phase conducted in an ETL pipeline, while also emphasising its responsibility for extracting the data.

The optimization of the extraction phase is similar in the ETL and ELT cases, as the pipelines collect data from the same sources. The key differences between the procedures become more apparent in the later stages of the data integration process.

2.2 Data transformation phase

The data transformation phase handles the same responsibilities in both ETL and ELT pipeline solutions. These responsibilities are case based and can be tuned to fit specific business rules and interests. Seenivasan (2022) expresses that the transformation phase is used for the conversion of data into a desired form usable in the target system. Examples of procedures that the transformation process is responsible for include, but are not limited to, data cleansing, data standardisation, data enrichment, data validation, data restructuring and type conversion. Gill (2020) provides examples of frequently used transformation techniques, such as data cleansing, standardization and validation. These techniques are commonly used, but other data transformation techniques can be applied if necessary with both data integration pipeline models.

The transformation phase takes time in both ETL and ELT pipeline solutions, as each additional procedure adds complexity to the system and increases processing time. In both the ETL and ELT cases the transformation phase is integral, as it handles the task of manipulating the existing data into the format necessary for a specific use case. Nithish et al.
(2024) state that data transformation is the process of reconstructing raw data into an optimized form that can later be used in varying ways.

2.3 Data load phase

The data load phase is the process in which data is loaded into the target system after the other phases of the pipeline are completed. Gill (2020) expresses that the load phase is the transfer of data into the target system. In the article he states that these target systems may be data warehouses, data lakes or analytical platforms. Notably, the target system can affect the pipeline architecture, as the target system's computational capabilities and scalability determine which data integration pipeline should be used for optimal results.

The load phase is important for both ETL and ELT solutions, as it handles the data storage aspects of the data integration process. In essence, the load phase of the data integration process can be summarised as the storing of data in a target system for later use. Walha et al. (2024) explain the load phase in an ETL pipeline as the storage of transformed data in the target system. The article states that this stored data can later be used for querying and reporting.

3 ETL

Behrend & Jörg (2010) explain the ETL pipeline as a process that first extracts data, then transforms it, and finally loads it into the target system. ETL processes are used in the handling of data in many circumstances. While other methods of data cleansing exist, ETL is widely used, as it has been adopted into many systems and is in many cases compatible with legacy code and data management. Dinesh & Devi (2024) emphasise the important role of ETL pipelines in automating data integration processes and improving their efficiency. In their article they state that clean, standardized, structured data is important for analytics, reporting and decision making.
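The extract-transform-load ordering described above can be sketched in a few lines of Python. This is an illustrative toy, not a production pipeline; the record fields and the list standing in for a data warehouse are assumptions made for the example.

```python
def extract():
    # Raw records as they might arrive from a source system.
    return [{"name": " Alice ", "amount": "10"}, {"name": "Bob", "amount": "25"}]

def transform(records):
    # In ETL, cleansing and type conversion happen BEFORE loading.
    return [{"name": r["name"].strip(), "amount": int(r["amount"])} for r in records]

def load(records, warehouse):
    # Only transformed records ever reach the target system.
    warehouse.extend(records)

warehouse = []
load(transform(extract()), warehouse)
print(warehouse[0])  # {'name': 'Alice', 'amount': 10}
```

The key point the sketch makes is structural: the target (`warehouse`) only ever sees cleaned, typed records, which is what gives ETL its data governance advantage.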
3.1 ETL data processing method

ETL processes work by gathering a large amount of data from varying sources. After the extraction of data, the ETL method transforms the available data into a desired form defined by differing needs and business rules. After the data is transformed into a suitable schema, it is loaded into a data warehouse, from which it can be accessed and used for various purposes. Dinesh & Devi (2024) state that the automation of the ETL process is crucial in making the data integration process more efficient and reliable. ETL tools and platforms are used in the process of transforming data into a suitable format for later use.

3.1.1 Data extraction process for ETL

The data extraction process in ETL methods consists of gathering a large amount of relevant data from multiple sources. The gathered data is not usually in its final format. In this stage the data is gathered into a staging area, where it is stored in preparation for the transformation process that follows. Mandala (2019) states that the extraction process is the collection of data from various sources. The extracted data can consist of both structured and unstructured data. Mandala explains in the article that the extraction process is carried out while the system load is minimal. This is done to mitigate the performance decrease caused by the ETL pipeline processing data. Mandala also states that raw data is extracted in bulk from the source systems to retain the efficiency of the pipeline.

3.1.2 Data transformation process for ETL

The transformation process in ETL methods is used to change raw data into usable, structured data that can serve other purposes later. Mandala (2019) explains that the raw data in the staging area is cleansed and standardised to the specifications of a certain schema. Mandala states that the transformation process can consist of data deduplication, format standardization and the application of business rules to the data.
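A minimal sketch of such a transformation step follows, combining deduplication, format standardization and a business rule. The record fields (`id`, `country`, `amount`) and the rule discarding non-positive amounts are hypothetical examples, not taken from the cited sources.

```python
def transform(staged):
    # Deduplicate on a business key, keeping the first occurrence.
    seen, unique = set(), []
    for record in staged:
        if record["id"] not in seen:
            seen.add(record["id"])
            unique.append(record)
    cleaned = []
    for record in unique:
        # Format standardisation: uniform casing and whitespace.
        record = {**record, "country": record["country"].strip().upper()}
        # Illustrative business rule: discard non-positive amounts.
        if record["amount"] > 0:
            cleaned.append(record)
    return cleaned

staged = [
    {"id": 1, "country": " fi ", "amount": 100},
    {"id": 1, "country": " fi ", "amount": 100},  # duplicate
    {"id": 2, "country": "se", "amount": -5},     # violates the rule
]
print(transform(staged))  # [{'id': 1, 'country': 'FI', 'amount': 100}]
```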
Notably, these and other transformations can be included in the transformation phase to reach the desired structure. The transformation process takes a large amount of time and resources, as all the gathered data needs to be transformed into the final wanted structure. If the wanted data form is complex, the performance of the ETL process may decrease, as each individual transformation increases the complexity and resource usage of the transformation process. InfluxData (n.d.) explains that the more complex the transformations conducted by the pipeline, the more impact they have on its performance. This is because each transformation of the raw data requires a corresponding amount of processing power. InfluxData describes the duties of the transformation phase as reformatting extracted data, improving data quality by changing or removing inaccurate and inconsistent data, and cleaning datasets of corrupted, duplicated or otherwise incorrect data. These processes are used to change the extracted unstructured data into structured data. "The goal of transformation is to make all data fit within a uniform schema before it moves on to the last step." (Informatica, n.d.) This is because only the transformed data is used by the load phase.

3.1.3 Data loading process for ETL

Mandala (2019) states that after the transformation process is completed, the transformed data is transferred into the target system. The article states that the load process is scheduled for hours in which system traffic is lower, to prevent performance issues. Once the transformation process is completed, either for all of the available data or for a subset of it, the transformed data can be loaded into the target system, in many cases a data warehouse.
The loading process is the final step in the ETL process, and after it is complete the data within the data warehouse can be used for other operations. These operations can include, for example, analytics, reporting, machine learning and forecasting. For example, the processed data can be used in the creation of operational reports for a business.

3.1.4 Data transfer process within ETL processes

In ETL data integration pipelines, the extraction process moves data from the source database to the staging area. The transformation process occurs in the staging area. The cleaned data is then loaded into the target system, from which it can be used for varying purposes.

Figure 1 "A diagram depicting an ETL pipeline for the integration of dashboard data." (Myakala, Bura & Juma, 2024).

3.2 Key performance indicators

Throughput and latency are complementary metrics, as together they measure how much data an ETL process handles and in what amount of time. Theodorou & Ayomide (2025) list ETL key performance indicators, one of which is latency. Latency is the delay between user input and the response of the system. In their article they also explain that another key performance indicator is throughput, which they define as the amount of data handled by ETL operations. Arsan & Amagowni (2022) state that the performance of a pipeline can be quantified by measuring its throughput.

3.2.1 Throughput

Throughput is a key metric in understanding the amount of data handled by ETL processes. Rongala (2025) states that ETL pipelines used for financial applications use throughput as a way to measure the amount of data handled by the pipeline in a certain amount of time. Throughput can be measured directly as the amount of data that has been processed by an ETL system. Akkaoui, Vaisman & Zimányi (2019) state that throughput is a key performance indicator for the efficiency of ETL pipelines.
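As a rough sketch, throughput can be measured directly by timing one run of a processing function and dividing the record count by the elapsed time. The processing step below is a placeholder standing in for a real pipeline stage.

```python
import time

def measure_throughput(records, process):
    """Return records processed per second for one pipeline run."""
    start = time.perf_counter()
    for record in records:
        process(record)
    elapsed = time.perf_counter() - start
    return len(records) / elapsed

# Placeholder workload: 100,000 trivial records through a dummy transform.
rate = measure_throughput(range(100_000), lambda r: r * 2)
print(f"{rate:,.0f} records/s")
```

Real monitoring would sample this continuously rather than once, but the unit of measurement (records per unit time) is the same.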
3.2.2 Latency

Latency is a key metric for how well an ETL process operates and how efficiently it can handle data. There are different types of latency within ETL processes. Adetnji (2025) defines several types of latency: end-to-end latency, processing latency and ingestion latency. In the article, end-to-end latency is defined as the time taken from when data first arrives in the pipeline to the time it is acted upon. Processing latency is described as the time taken for the extraction, transformation and loading of an output, and ingestion latency as the delay occurring before the pipeline receives an event. These three types of latency can be used as metrics to determine which parts of the data pipeline operate at a sufficient level and which still need to be improved. Understanding the latency within a system is critical in assessing its performance. The number of errors also affects the performance of a system, and latency needs to be balanced with accuracy to achieve usable results from an ETL process. Rongala (2025) describes latency as the time taken for the whole pipeline to process and store the data. This difference shows that latency can be measured in different ways in data integration pipelines.

3.3 Optimization of ETL models

ETL optimization can be considered in three stages: extracting data, processing data into a suitable form, and loading data into a storage system such as a data warehouse. Each of these steps should be optimized to obtain the best possible performance for a given use. The optimization of ETL tools is a large contributor to how well the ETL data pipeline operates. In many cases even the basic design affects how much time the operations take to run through. This may take a considerably long time when large data quantities are being processed.
To decrease this wait time and make transformed data available faster, batch ETL is a good approach to consider. Batch ETL processes large data quantities in batches and thus increases the availability of transformed data. Walha, Ghozzi and Gargouri (2024) imply that data becomes available only after the ETL process is completed, and that the result of the process is a unified dataset. Other ETL optimization techniques are real-time ETL and micro-batch ETL. These techniques are used to balance throughput and latency. The choice of technique depends on the needs of the connected systems and the user.

3.3.1 Data extraction optimization

One way of optimizing the data extraction process for ETL is applying incremental extraction techniques. Incremental extraction works on the premise that loading only the data that has changed is faster and more resource efficient than extracting a full duplicate of the data already stored in the system. Oracle (n.d.) describes incremental extraction as extracting only data that has changed in comparison to a known point in time. An example of incremental extraction is leveraging the timestamps of the data. During the first run of the ETL pipeline there is no previous reference timestamp, so all data is extracted. On later runs, incremental extraction compares the timestamps of the data to be extracted against the timestamps of previously extracted data, and only newer data is extracted from the source system. This decreases the amount of data extracted and thus allows the extraction phase to complete faster.

Partitioning data is also a valuable optimization technique. It divides data into partitions based, for example, on time, business keys or source. This technique achieves efficiency gains by discarding unnecessary partitions and thus reducing the system's overhead.
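The partition-pruning idea can be sketched as follows. The date-keyed in-memory partitions are a stand-in for partitioned files or tables in a real system; the keys and rows are illustrative.

```python
from datetime import date

# Illustrative partitions keyed by load date; a real system would hold
# files or table partitions rather than in-memory lists.
partitions = {
    date(2024, 1, 1): [{"id": 1}, {"id": 2}],
    date(2024, 1, 2): [{"id": 3}],
    date(2024, 1, 3): [{"id": 4}],
}

def read_since(partitions, cutoff):
    """Prune partitions older than the cutoff instead of scanning everything."""
    rows = []
    for key, part in partitions.items():
        if key >= cutoff:  # untouched partitions are skipped entirely
            rows.extend(part)
    return rows

print(read_since(partitions, date(2024, 1, 2)))  # only ids 3 and 4
```

The saving comes from never opening the pruned partitions at all, which is where the reduced overhead originates.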
"The framework first partitions an ETL dataflow into multiple execution trees according to the characteristics of ETL constructs, then within an execution tree pipelined parallelism and shared cache are used to optimize the partitioned dataflow." (Liu & Iftikhar, 2015).

3.3.2 Data transformation optimization

Data transformation optimization is a key part of how well an ETL model operates. This is because each data transformation adds complexity to the ETL process. The added complexity affects the amount of time taken and the resources used to complete the processing of data with the ETL software. Oracle (n.d.) states that the transformations are in many cases the most time-consuming and complex part of an ETL pipeline.

ETL models handle the data transformation processes before the data is loaded into the target system. This increases the latency of the system as a whole and increases processing times.

ETL data integration pipeline models use parallelization of transformation tasks to optimize the transformation process. In parallelization the transformation tasks are divided among multiple threads to improve the performance of the model. Incremental and partitioning techniques can also be used in the transformation phase of the ETL process, and other techniques can be applied as well. Kumari (2017) gives examples of ETL performance optimization techniques such as pushdown optimization, partitioning and parallelization.

3.3.3 Data loading optimization

There are various ways to optimize the load process in ETL pipelines. One of these is to use multithreaded or parallel loading with an optimally selected batch size. Oracle (2009) states that parallel execution is a viable technique for optimizing an ETL pipeline.
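A minimal sketch of batch-sized loading follows, using an in-memory SQLite database as a stand-in for the target warehouse. The `facts` table and the batch size of 500 are illustrative choices, not recommendations from the cited sources.

```python
import sqlite3

def load_in_batches(connection, rows, batch_size=500):
    """Insert rows with executemany in fixed-size batches rather than one by one."""
    connection.execute("CREATE TABLE IF NOT EXISTS facts (id INTEGER, value REAL)")
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        connection.executemany("INSERT INTO facts VALUES (?, ?)", batch)
    connection.commit()

target = sqlite3.connect(":memory:")
load_in_batches(target, [(i, i * 0.5) for i in range(1200)], batch_size=500)
count = target.execute("SELECT COUNT(*) FROM facts").fetchone()[0]
print(count)  # 1200
```

Tuning `batch_size` trades per-statement overhead against memory use, which is the balance the batch-sizing technique in the text refers to.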
Another way of optimizing the load procedure is to use shared cache memory along with multiple cores. The main idea behind this optimization route is to reduce memory overhead and the number of redundant copies. Through this method the load process as a whole can be optimized in terms of performance. Masouleh, Afshar, Alborzi & Toloie (2016) state that ETL pipeline performance can be improved considerably by using shared cache memory and parallel processing. These and other methods can be used together or separately to optimize the pipeline.

3.4 Relevance of ETL tools in data architecture

ETL processes are a key piece of data architecture. This is especially clear when considering big data and data warehousing. Such large projects regularly consist of cleansing and optimizing data that needs to be used later on. Dhaouadi et al. (2022) state that 80% of the time spent on a data warehouse project is used for extracting, cleaning and loading data, when the time taken by the inherent issues of Big Data is not considered.

4 ELT

ELT is a data pipeline process where data is first extracted from a number of sources. After the extraction process the raw data is loaded into a target system, for example a data warehouse or a data lake. Ballard et al. (2011) state that the ELT process first extracts the data, after which it is loaded into a data warehousing environment. After the data is loaded into the target system, the raw data is transformed into a suitable form. Google Cloud (n.d.) states that the target system of an ELT pipeline can be a data lake or a cloud data warehouse. The transformation of the data is completed using the target system's operational power. In the article, Google Cloud emphasises that ELT pipelines leverage the computational resources of the target system to conduct the transformations on the raw data.
4.1 ELT data processing method

The ELT data processing method consists of three primary processes: data extraction, data loading and data transformation. The loading process is the second phase of the ELT process; in this phase new data is loaded into a target system. "In the second step, the extracted raw data is loaded, often in its original format or with minimal processing, directly into a high-capacity storage system." (Google Cloud, n.d.) The final phase of the ELT pipeline is the transformation performed on the preloaded data within the target system.

4.1.1 Data extraction process for ELT

The data extraction process is the first phase of the ELT pipeline. In this step the main objective is to retrieve useful data from source systems and files, which in many cases are not structured. Spooner (2011) describes the extraction process as accessing a source system from which only the data used in the pipeline is extracted. This is useful as it filters out unnecessary data. This phase handles data gathering for the later phases, in which the raw data is stored and processed.

4.1.2 Data load process for ELT

The data load process is the phase of the ELT pipeline in which the previously extracted data is loaded into a target system. This phase occurs directly after the raw data is obtained in the extraction phase. Seenivasan (2022) states that the load process transfers the raw data into a target system where it can be stored. These target systems may vary, but a common feature is that they innately have, or have direct access to, a large amount of processing power. Various cloud storage systems, for example, can be used for these purposes. Seenivasan (2022) also states that the raw data can be copied into a staging area or directly into the target system.
This can allow for more flexibility when designing ELT pipelines. Overall, the load process is responsible for the transfer of data into a storage system and for the storing of the data itself.

4.1.3 Data transformation process for ELT

In this stage of the ELT data integration pipeline, the data that has been obtained and stored in the previous phases is transformed into a suitable form. In a Google Cloud (n.d.) article the transformation process is described as the final step of the ELT pipeline. The article also states that in ELT data integration pipelines the transformation process occurs within the target system.

The transformation process is mainly responsible for the cleansing, standardization, normalization, enrichment, modelling, joining and validation of the data. In the Google Cloud (n.d.) article the transformation process is described as cleaning, structuring and enriching raw data while converting it into a desired format. The article states that this process can serve analytics, reporting or machine learning needs. The article also describes possible transformations conducted during the transformation process; these include, for example, filtering data, joining data, data aggregation, data format standardization and the derivation of new data points. These methods can be applied to the raw data obtained in the extraction phase of the data integration pipeline. The transformations mentioned can be used to manipulate varying types of raw data into structured data usable in the future. The proper use of each transformation method is key to obtaining usable material for future use.

The data to be transformed is already within a target system that can handle the computation for the data transformation process. Seenivasan (2022) describes the transformation process as the level where the transformations are done.
In his article he explains that the data transformation happens within the target system, as the target system of an ELT pipeline has the computational capability needed to handle intensive transformations of the data. This is exemplified by data stored within cloud-based systems, as they commonly have considerable computational resources, allowing ELT pipelines to operate smoothly within them.

The main goal of the transformation process is to change raw data into clean and usable data for future processing. This is an important step, as the raw data does not fit all system requirements and business rules.

4.1.4 Data transfer process within ELT processes

The ELT data integration pipeline processes data in the following order. First, data is extracted from varying sources. The extracted data is then loaded directly into the target system. This target system uses its processing power to transform the data into a cleansed form, in which it can be used for varying analytics.

Figure 2 "ELT Process" (Qlik, n.d.).

4.2 Key performance indicators

The key performance indicators that can be used to determine the effective usage of ELT systems are varied, but the main indicators considered in this review are throughput and latency. These indicators have been chosen because they can be used to determine the efficiency of ELT data integration pipelines. Allwell et al. (2025) describe latency as the time delay from the start of the process to the output of the data. In the same article, throughput is described as the volume of data processed by a system in a certain amount of time. Both throughput and latency can be considered key performance indicators for ELT data integration pipelines, as these pipelines are designed to handle large amounts of data in a short amount of time.

4.2.1 Throughput

Throughput measures the amount of data processed in a certain amount of time.
This metric is important in understanding the efficiency of ELT data pipeline solutions, as it gives clear information on how quickly the pipeline can handle a certain amount of data. Allwell et al. (2025) consider data throughput to be the data volume a pipeline can process in a given time frame. Throughput is a key performance indicator in all types of data integration pipelines, but real-time pipeline models especially emphasize its role in optimization. Throughput is measured for the whole pipeline, including the extraction, load and transformation phases. A large factor in the final throughput of an ELT pipeline solution is the complexity of the transformation phase, as each additional transformation increases the pipeline's complexity. With increased complexity, the processing time grows, affecting throughput if proper optimization techniques such as parallel processing are not used. Owen (2024) states that if a pipeline does not use optimization techniques, its overall performance decreases, and that in such cases performance is noticeably affected by the complexity of the transformation phase. Of two pipelines that function identically in all other aspects, the one with the higher throughput is the more efficient.

4.2.2 Latency

Latency is the end-to-end measurement of how long it takes for the pipeline to operate fully and for the processed data to become available for use. In the case of ELT pipelines this means that latency is the entire amount of time taken by the extraction phase, the load phase and the transformation phase.

Latency can be measured for each individual phase of the pipeline or as a whole; for example, the latency of the set of transformations may be named transformation latency. Another form of latency is ingestion latency. Dupe et al.
(2025) state that ingestion latency is the time delay between the beginning of the whole process and the first event of a pipeline. Latency is a key metric in determining the efficiency of the data integration pipeline, as the time taken for processing data may be long. By knowing the latency of a data integration pipeline, the appropriate solution may be chosen on a case-by-case basis. Latency is largely affected by the way a data integration pipeline is optimized. By optimizing a data pipeline, the latency of the system can be changed to fit different needs.

4.3 Optimization of ELT models

The optimization of an ELT data integration pipeline model can be divided into three categories based on the phase of the pipeline. A pipeline should be optimized as a whole as well as in its individual parts. Each optimization needs to be planned beforehand to maximize the benefits it can provide. The optimization of a pipeline can affect the performance of a model considerably. Zvonarev et al. (2023) state in their article that while the number of optimization methods proposed for ELT pipelines is lower than the number proposed for ETL pipelines, many of the proposed optimization models can be used in both data integration pipeline models.

4.3.1 Data extraction process optimization

The data extraction phase of the data integration pipeline should be optimized, as the extracted data is key in defining the possible uses of the data and the scalability of the systems using the final data. Optimization of the extraction process can also impact the performance of the pipeline.

The extraction process can be optimized in many ways. Examples of such optimization techniques are change data capture, parallel extraction, batch sizing, compressing data for the transfer, incremental extraction and extracting only relevant data. Change data capture operates on the idea of not extracting data that the system already has.
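A minimal sketch of the change-data-capture idea, assuming each source row carries an `updated_at` field (the field name and watermark logic are illustrative, not taken from the cited sources):

```python
def incremental_extract(source_rows, last_watermark):
    """Extract only rows changed since the previous run's watermark,
    so data the system already has is not extracted again."""
    changed = [row for row in source_rows if row["updated_at"] > last_watermark]
    # Advance the watermark so the next run skips these rows as well.
    new_watermark = max((row["updated_at"] for row in changed),
                        default=last_watermark)
    return changed, new_watermark

rows = [
    {"id": 1, "updated_at": 10},
    {"id": 2, "updated_at": 25},
    {"id": 3, "updated_at": 40},
]
changed, watermark = incremental_extract(rows, last_watermark=20)
print(changed)    # only rows 2 and 3 are extracted
print(watermark)  # 40
```

Real implementations typically read such changes from a database transaction log rather than comparing timestamps, but the effect is the same: only new or changed data enters the pipeline.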
Batch sizing, on the other hand, can be used to tune the model for optimal throughput by changing the size of each batch being processed.

4.3.2 Data load process optimization

The data load phase of the ELT data integration pipeline should be optimized, as it is responsible for writing data into the target system of the pipeline. By optimizing this process the system can handle larger amounts of data in less time.

The optimization of the load process can be done in many ways, and the optimization technique needs to be chosen on a case-by-case basis. The load process for ELT pipeline solutions can be optimized by partitioning the load process, partitioning the extracted data, compacting small files into larger files and using clustering methods. Richman (2024) states that by applying incremental load strategies in data storage systems, the time taken by the system to load the data can be decreased and the use of computational resources can be lessened.

4.3.3 Data transformation process optimization

Zvonarev et al. (2023) state that optimization of the transformation process is a key part of the optimization of the data integration pipeline. In their article they state that by executing the transformation processes directly within the target system, the transformation phase as a whole can be significantly optimized. Zvonarev et al. also provide other optimization techniques, such as batch processing, incremental transformations and parallel processing, to optimize the transformation process. An example of a transformation optimization technique is data layout optimization. Data layout optimization is a technique in which the way data is stored is optimized, meaning data partitioning, clustering and key distribution. With these methods data layout optimization can make the transformation process more efficient. Oracle (2024) states that throughput can be increased by performing the SQL operations in batches.
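The batching idea can be sketched with Python's built-in sqlite3 module standing in for the target database (the table and data are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # stand-in for the target system
conn.execute("CREATE TABLE sales (id INTEGER, amount REAL)")

rows = [(i, i * 1.5) for i in range(1000)]

# One batched call instead of 1000 single-row INSERT round trips:
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
conn.commit()

count, = conn.execute("SELECT COUNT(*) FROM sales").fetchone()
print(count)  # 1000
```

Against a real warehouse the saving comes from fewer network round trips and fewer per-statement commits, which is exactly the overhead batching removes.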
With the use of these techniques the transformation process can be optimized for various circumstances, including improving the efficiency of the data pipeline.

4.4 Relevance of ELT tools in data architecture

ELT data integration pipelines are highly relevant in data architecture and the overall design of a functional system, as they handle a large share of the data gathering, storing and normalization operations a system performs. ELT data integration pipeline solutions are becoming more prevalent with the adoption of new technology. Garcia (2023) notes in his article that cloud-based data storage systems such as Snowflake or BigQuery can be easily scaled to meet the requirements of a system and that they can be optimized to process data within the storage system itself. This can improve the throughput and latency of a pipeline if designed properly. ELT solutions use the computational power of the storage systems. This increases the performance of the pipeline models while also simplifying the architecture of the data integration pipeline. When using an ELT data integration pipeline, the gathered data is both centralized and democratized, as raw and transformed data are included in the same storage system.

5 Comparison between ETL and ELT tools

This chapter aims to provide a clearer understanding of the ETL and ELT data integration pipeline models through comparison. It analyses what similarities and differences the ETL and ELT pipeline models include.

5.1 Overall process comparison

The goal of both ETL and ELT tools is to transfer data from source systems to target systems and to transform the data into a desired form. With automation of the ETL and ELT processes, data integration can be accelerated to meet the needs of data analytics. The main difference between these tools is how the process is handled. While the extraction of data is similar in both models, the transforming and loading components vary considerably.
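The structural difference can be sketched side by side; the functions are illustrative, with a Python list standing in for the target system:

```python
def clean(record):
    """Illustrative transformation: normalize a raw string record."""
    return record.strip().lower()

def etl(sources, target):
    """ETL: extract, transform in a separate staging area, then load."""
    staging = [r for src in sources for r in src]   # extract into staging
    staging = [clean(r) for r in staging]           # transform before load
    target.extend(staging)                          # load cleansed data

def elt(sources, target):
    """ELT: extract, load the raw data, then transform inside the target."""
    target.extend(r for src in sources for r in src)  # load raw data
    target[:] = [clean(r) for r in target]            # transform in target

sources = [["  Alice "], ["BOB"]]
etl_target, elt_target = [], []
etl(sources, etl_target)
elt(sources, elt_target)
print(etl_target == elt_target)  # True: same result, different order
```

Both sketches end with the same cleansed data; the difference that matters in practice is where the transformation work is executed, which the following sections compare in detail.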
5.2 Process comparison between ETL and ELT tools

The largest difference between ETL and ELT data integration pipelines is the operation pattern each of them uses. In these two pipeline models the transformation phase occurs at different points. In ETL pipeline models the transformation of the data occurs immediately after the extraction phase in a separate staging area, after which the transformed data is loaded into the target system. On the other hand, in the ELT data pipeline model the transformation phase occurs after the raw data is loaded into the target system. The transformation of the raw data occurs directly within the target system, leveraging the system's inherent processing power. The pipeline models differ in their structure due to the inherent differences in their target systems, business rules, approaches to transforming data and data volume. The inherent difference in target systems forces an ETL pipeline to complete the transformation phase before the data is loaded into the target system, as the target system does not have the processing power to complete the necessary transformations. In comparison, an ELT pipeline with a target system that has considerable processing power can handle the needed transformations. Another difference lies in how the pipeline models handle data quality control.

5.2.1 Extraction process review

From an operational aspect the data extraction process is used for data collection in both pipeline models. Data is collected from varying sources to be used at a later stage in the pipeline. The extraction process is completed in both pipeline models as the first stage of the data integration pipelines. A Google Cloud (n.d.) article states that the extraction process only gathers the data without using it for other purposes. The extraction process philosophy is slightly different between the data pipelines.
While both pipeline models can handle data of differing quality, the ETL model operates more easily if the extracted data is cleaner beforehand in comparison to ELT data integration pipeline models. This also affects the amount of data each pipeline model can accommodate with ease, as cleaner and more optimized data takes less memory to store.

Overall, the data extraction process in both data integration pipeline models is used for the same purpose of data gathering. The extraction process is completed in the same way in both pipeline models, and in both it is the first stage, performed in preparation so the following stages can be completed. The optimization of ETL and ELT data integration pipelines for efficiency is similar, as many of the techniques work for both models. One of the key optimization techniques that can be used in both models is CDC or incremental loading. CDC, or change data capture, means that only the changes in the source data are extracted instead of all source data each time. Agarwal (2025) explains incremental data load as a method in which only data that has changed after the latest completed process is selected and used. This can be leveraged to gain efficiency during the pipeline run, as the same data does not need to be reworked, allowing better use of a system's resources.

5.2.2 Transformation process review

The transformation process is the core of ETL and ELT data integration pipeline processes, as it is directly responsible for transforming raw data into a clean, consistent and usable form. For both ETL and ELT data integration pipeline models the task the transformation phase is used for is the same. The key difference between the transformation phase in the two pipeline models is the location where the transformation occurs and at what point in the pipeline it is performed.
In the ETL pipeline model the transformation phase is performed in a separate staging area before the transformed data is loaded into the target system. For the ELT pipeline model, on the other hand, the transformation phase is performed in the target system itself after the load process. Memory usage efficiency is notable here, as the staging area in the ETL data pipeline model can bottleneck the efficiency of the transformation process. This is affected by the usage rate and the overall amount of staging area memory. This is less of a concern for the ELT data pipeline model, as the transformation phase occurs directly in the target system. This can allow the ELT model to dismiss concerns about the staging area, but other bottlenecks, such as the amount of processing power, still affect the transformation phase. Qlik (n.d.) states that an ELT system works on an as-needed basis. ELT pipelines integrate data only when queried. This can cause delays for the future needs of the system.

The optimization of the transformation process differs between the two data integration pipeline models. ETL data integration pipelines achieve transformation process optimization through process-level optimizations: parallelization of transformation processes, incremental transformation processes and partitioning are all used to optimize the ETL pipeline's transformation process. The ELT data integration pipeline transformation process is optimized with pushdown optimization, inbuilt tools in the storage system and batch processing, among other techniques. The ELT pipeline uses in-database optimizations such as query optimization, index optimization and in-database processing to improve the efficiency of the transformation process. Garcia (2023) states that a data integration pipeline can use these computational capabilities in the pursuit of lower cost and increased performance. Notable in this is the need for data governance.
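Pushdown optimization can be illustrated with sqlite3 as a stand-in for the target system: the cleansing logic is expressed as SQL and executed inside the database rather than in the pipeline process (the table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the ELT target system
conn.execute("CREATE TABLE raw_events (email TEXT)")
conn.executemany("INSERT INTO raw_events VALUES (?)",
                 [("  Alice@Example.COM ",), ("bob@example.com",)])

# Pushdown: the transformation runs as SQL inside the target database,
# instead of pulling the rows out into the pipeline process.
conn.execute("""
    CREATE TABLE clean_events AS
    SELECT lower(trim(email)) AS email
    FROM raw_events
""")

emails = [row[0] for row in conn.execute("SELECT email FROM clean_events")]
print(emails)  # ['alice@example.com', 'bob@example.com']
```

Because the data never leaves the target system, the transformation benefits directly from the database's own query optimizer and compute resources.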
In systems that transform data within the target system, the security aspects may be lacking for certain purposes. As data arrives in the target system without prior transformation, sensitive data such as credentials may be transferred in an unencrypted or unmasked state. This can create data security concerns, as security controls are applied only after the data is in the target system.

5.2.3 Load process review

The load process is the phase in which both ETL and ELT data integration pipeline models load data to be stored in the target system. In ETL pipeline models the loaded data is cleansed, while in the ELT pipeline the data is raw, directly extracted from the source systems. Thus, in many cases the load process complexity is higher with ELT pipeline models. Snowflake (n.d.) considers the load process in ELT pipeline models to be more agile than in ETL pipeline models. In the article Snowflake emphasizes the advantages of loading raw data directly into the target system. These advantages include the ability to ingest data in a raw form, a decreased need to transfer data between multiple pipelines and the capability to handle large data volumes. This difference can allow for more flexibility in later usage. With ELT pipeline models the memory usage is heavier, as both raw and transformed data can be stored, in comparison to the ETL pipeline in which only transformed data is commonly stored.

The key similarity between ETL and ELT pipeline models in the load process is the purpose for which the load process is used. The key differences are the order in which the phase is executed, the load complexity and the load target system.

The load process can be optimized in varying ways for both ETL and ELT data integration pipeline models. These optimization strategies include but are not limited to incremental loads, partitioning, clustering, parallel processing and other optimization techniques.
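An incremental load can be sketched as an upsert: only changed rows are sent, and existing rows are updated in place rather than reloading the full table. sqlite3 again stands in for the target system; SQLite has supported this `ON CONFLICT` upsert syntax since version 3.24, and all names here are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, city TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Vaasa"), (2, "Helsinki")])

# Incremental load: only the changed/new rows, upserted by primary key.
changed_rows = [(2, "Tampere"), (3, "Oulu")]
conn.executemany(
    """INSERT INTO customers VALUES (?, ?)
       ON CONFLICT(id) DO UPDATE SET city = excluded.city""",
    changed_rows)

print(conn.execute("SELECT id, city FROM customers ORDER BY id").fetchall())
# [(1, 'Vaasa'), (2, 'Tampere'), (3, 'Oulu')]
```

Only two rows crossed the pipeline, yet the target table ends up fully up to date, which is the saving in load time and compute that Richman (2024) and Agarwal (2025) describe.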
Agarwal (2025) notes that optimizing data integration pipelines with the incremental load strategy is capable of improving the load process performance. The layout of the data can also be used in optimizing the load process. Overall, the optimization of the load process can aid in the optimization of the entire pipeline.

5.3 Suitability for different environments and use cases

ETL and ELT differ in which environments each model is optimal for. The use cases of these data integration pipeline models also differ. It must be noted that there is considerable overlap in what these systems can be adapted to handle. The differences between the ETL and ELT data integration pipeline models are most notable when an on-premises system needs to be implemented. The more optimal approach when an on-premises data integration pipeline needs to be designed is an ETL pipeline. This is because ETL pipeline models can handle the data in a specific system without using cloud storage or other offsite data storage systems. Qlik (n.d.) states that sensitive data can be left out before the transformation process in ETL pipeline models. This can be very useful when considering the system's compliance with local laws. In the article Qlik provides examples of standards where the ability to leave out sensitive data is necessary, such as GDPR or HIPAA. The ETL pipeline model can also operate complex transformation logic without having to use SQL or other transformation logic engines.

ELT data integration pipeline models can handle a large quantity of data by using the computational power of the data storage system in use. This is because ELT systems can leverage the cloud storage systems where the data is stored to operate the transformation logic. Google Cloud (n.d.)
explains how data lakes can be used as a viable target system, as they are designed to handle large amounts of data while also having the computational capability to handle ELT pipelines. Use cases that require scalability due to a large quantity of data are also optimal for ELT data integration pipeline models, as the scalability needs can more easily be accommodated with the aid of cloud data storage.

6 Conclusions

This study gives a comprehensive review of ETL and ELT data integration pipelines. It is a literature review that uses both industry articles and academic articles as sources. The study begins with an introduction to what ETL and ELT data integration pipelines are, followed by further study of how they process data and how the pipelines can be optimized for efficiency. After this the processes are compared on what similarities and differences they include.

Both ETL and ELT pipelines include three main phases: the extraction process, the transformation process and the load process. The extraction process is similar in both models, as raw data is extracted from varying source systems. The transformation process handles the same task in both pipeline models, but the tools used in this process differ between the pipeline models. The load process is used to store data in the target system. With ETL pipelines this is the final phase of the pipeline, while with ELT pipelines the load process precedes the transformation process. Differences in the phase structures are the clearest distinctions between the two pipeline models.

The differences between ETL and ELT data integration pipelines are the result of deviating design in hardware, software and the optimization of the models. The hardware used by ETL pipelines as the target system can for example be a data warehouse with limited processing power.
The hardware used by ELT pipelines is typically a cloud storage system that includes notable processing power. This inherent difference in the capability of transforming loaded data forces ETL pipelines to transform the raw data outside the target system in a staging area before loading the curated data into the target system. An ELT pipeline can use the processing resources of the target system to transform raw data. These differences caused by the hardware also affect the software, as ETL pipelines are designed in a way that allows for the transformation of a large amount of data while having limited processing power in the target system. The software of ELT pipelines is designed to take advantage of the processing power of the target system, allowing large amounts of raw data to be loaded directly into the target system. The differences in the available optimization techniques come from the differences between the pipeline architectures and the target systems themselves. Some of these optimization techniques can be used in one or both pipeline models. Incremental extraction can be used in both pipelines to decrease memory usage, as with incremental extraction only changed and new data that the pipeline has not already processed is extracted from the source system. SQL batch execution is used in ELT pipelines to optimize the transformation phase by grouping and executing multiple transformations at once. With this technique the pipeline's throughput can be increased with better CPU utilization. The load phase in ETL pipelines can be optimized with optimized batch size loading. Optimized batch size loading combines data into batches of rows that are loaded into the target system as one operation. This can maximize throughput by decreasing loading overhead. The usage of optimization techniques can improve the performance of both ETL and ELT data integration pipelines considerably.

References

Agarwal Sanchit, (2025).
ETL Incremental Loading 101: A Comprehensive Guide, Hevodata, https://hevodata.com/learn/etl-incremental/

Arsan Roy & Amagowni Sudesh, (2022). Benchmarking your Dataflow jobs for performance, cost and capacity planning, Google Cloud, https://cloud.google.com/blog/products/data-analytics/benchmarking-dataflow-jobs Retrieved 7.11.2025

Ayomide Joel, (2025). Optimizing data latency and throughput in ETL processes through reinforcement learning, ResearchGate, https://www.researchgate.net/publication/398931364_OPTIMIZING_DATA_LATENCY_AND_THROUGHPUT_IN_ETL_PROCESSES_THROUGH_REINFORCEMENT_LEARNING

Ballard Chuck, Gomes Veronica, Hilz Gregory, Panthagani Manjula & Samuelsen Claus, (2011). Data Warehousing with the Informix Dynamic Server, www.redbooks.ibm.com/redbooks/pdfs/sg247788.pdf

Behrend Andreas & Jörg Thomas, (2010). Optimized incremental ETL jobs for maintaining data warehouses, conference proceeding, https://doi.org/10.1145/1866480.1866511

Dhaouadi, A., Bousselmi, K., Gammoudi, M. M., Monnet, S., & Hammoudi, S. (2022). Data Warehousing Process Modeling from Classical Approaches to New Trends: Main Features and Comparisons. Data, 7(8), 113. https://doi.org/10.3390/data7080113

Dinesh, L., Devi, K.G. (2024). An efficient hybrid optimization of ETL process in data warehouse of cloud architecture. J Cloud Comp 13, 12. https://doi.org/10.1186/s13677-023-00571-y

Dupe Adetunji, Dorcas, Sheng Bou & Sulaimon Tunde, (2025). Latency vs. Accuracy: Balancing Performance with Data Integrity in Low-Latency Streaming ETL Workflows, https://www.researchgate.net/publication/392876455_Latency_vs_Accuracy_Balancing_Performance_with_Data_Integrity_in_Low-Latency_Streaming_ETL_Workflows

El Akkaoui Zineb, Vaisman Alejandro & Zimányi Esteban, (2019). A Quality-based ETL Design Evaluation Framework, in Proceedings of the 21st International Conference on Enterprise Information Systems (ICEIS 2019), pages 249-257, https://doi.org/10.5220/0007786502490257

Shaker El-Sappagh H.
Ali, Abdeltawab Ahmed M. Hendawi & El Bastawissy Ali Hamed, (2011). “A proposed model for data warehouse ETL processes”, Journal of King Saud University - Computer and Information Sciences, Volume 23, Issue 2, July 2011, Pages 91-104, https://doi.org/10.1016/j.jksuci.2011.05.005

Faridi Masouleh M., Afshar Kazemi M. A., Alborzi M. & Toloie Eshlaghy A., (2016). Optimization of ETL Process in Data Warehouse Through a Combination of Parallelization and Shared Cache Memory, Engineering, Technology & Applied Science Research, https://etasr.com/index.php/ETASR/article/view/849

Garcia Miguel, (2023). “The Evolution of Data Pipelines: ETL, ELT, and the Rise of Reverse ETL”, https://dzone.com/articles/the-evolution-of-data-pipelines Retrieved 7.11.2025

Gill Sukhdeep, (2020).
“The First Step to Business Intelligence: Ensuring Data Quality Through Rigorous ETL Processes”, International Journal of Trend in Research and Development, Volume 7(4), 2020, http://www.ijtrd.com/papers/IJTRD28919.pdf

Google Cloud, (n.d.). What is ELT (extract, load, and transform)?, Google Cloud, https://cloud.google.com/discover/what-is-elt

InfluxData, (n.d.). ELT (Extract, Transform, Load), InfluxData glossary, https://www.influxdata.com/glossary/etl/ Retrieved 7.11.2025

Informatica, (n.d.). What is ETL (extract transform load)?, Informatica, https://www.informatica.com/resources/articles/what-is-etl.html Retrieved 24.9.2025

Kumari Deepika, (2017). Performance Optimization of ETL Process, ResearchGate, https://doi.org/10.13140/RG.2.2.13994.44480

Liu, X., & Iftikhar, N. (2015). An ETL Optimization Framework Using Partitioning and Parallelization. In Proceedings of the 30th ACM Symposium on Applied Computing (SAC 2015), Association for Computing Machinery. https://doi.org/10.1145/2695664.2695846

Mandala Nishanth Reddy, (2019). The evolution of ETL architecture: From traditional data warehousing to real-time data integration, World Journal of Advanced Research and Reviews, 2019, 01(03), 073–084, https://doi.org/10.30574/wjarr.2019.1.3.0033

Myakala Praveen Kumar, Bura Chiranjeevi & Juma Russell, (2024).
Interactive Data Dashboards: Design Principles, Best Practices, and Applications, ResearchGate, https://doi.org/10.13140/RG.2.2.14205.06882

Nithish, Ravi & David, (2024). “Data Transformation Techniques in ETL”, International Journal of Multidisciplinary on Science and Management, Vol. 1, No. 2, pp. 01-16, 2024. Retrieved from https://www.ijmsm.org/ijmsm-v1i2p101.html (11.2025)

Oracle, (n.d.). Introduction to Extraction Methods in Data Warehouses, https://docs.oracle.com/cd/B28359_01/server.111/b28313/extract.htm

Oracle, (2009). Parallel Capabilities of Oracle Data Pump, An Oracle White Paper, https://www.oracle.com/technetwork/database/datapump11g2009-parallel-1-132209.pdf Retrieved 7.11.2025

Oracle, (2024). “Oracle® Fusion Middleware Using Oracle GoldenGate for Big Data”, Release 19c (19.1.0.0), https://docs.oracle.com/en/middleware/goldengate/big-data/19.1/gadbd/using-oracle-goldengate-big-data.pdf Retrieved 7.11.2025

Owen Benjamin, (2024). Optimization of ETL/ELT Pipelines in High-Volume Data Platforms, ResearchGate, https://www.researchgate.net/publication/398467239_Optimization_of_ETLELT_Pipelines_in_High-Volume_Data_Platforms

Qlik, (n.d.). “ETL vs ELT”, https://www.qlik.com/us/etl/etl-vs-elt Retrieved 7.11.2025

Qlik, (n.d.).
What is ELT?, Qlik, https://www.qlik.com/us/elt Retrieved 13.11.2025

Raj A., Bosch J., Olsson H., et al., (2020). Modelling Data Pipelines. Proceedings - 46th Euromicro Conference on Software Engineering and Advanced Applications, SEAA 2020: 13-20. http://doi.org/10.1109/SEAA51224.2020.00014

Richman Jeffrey, (2024). “How to Load Data into a Data Warehouse: Methods & Challenges”, https://estuary.dev/blog/how-to-load-data-into-data-warehouse/ Retrieved 7.11.2025

Rongala Samyukta, (2025). Optimizing ETL Processes for High-Volume Data Warehousing in Financial Applications, Journal of Information Systems Engineering and Management, 2025, 10(8s), e-ISSN: 2468-4376, https://doi.org/10.52783/jisem.v10i8s.1130

Seenivasan Dhamotharan, (2022). “ETL vs ELT: Choosing the right approach for your data warehouse”, International Journal for Research Trends and Innovation (IJRTI), Vol. 7, Issue 2, pages 110-122

Snowflake, (n.d.). What Is ELT (Extract, Load, Transform)?, Snowflake site, http://www.snowflake.com/en/fundamentals/understanding-extract-load-transform-elt
Spooner John, (2011). Creating a SAS® Model Factory Using In-Database Analytics, SAS Global Forum 2011, Data Mining and Text Analytics, https://support.sas.com/resources/papers/proceedings11/147-2011.pdf

Theodorou Vasileios, Abelló Alberto, Lehner Wolfgang & Thiele Maik, (2017). Frequent patterns in ETL workflows: An empirical approach, Data & Knowledge Engineering, Volume 112, 2017, Pages 1-16, https://doi.org/10.1016/j.datak.2017.08.004

Walha Afef, Ghozzi Faiza & Gargouri Faiez, (2024). “Data integration from traditional to big data: main features and comparisons of ETL approaches”, The Journal of Supercomputing, 2024, Vol. 80 (19), pp. 26687-26725, https://doi.org/10.1007/s11227-024-06413-1

Zvonarev Aleksei E., Gudilin Dmitriy S., Lychagin Dmitriy A. & Goryachkin Boris S., (2023). “Extract-Load-Transform (ELT) Process Runtime Analysis and Optimization”, https://doi.org/10.1109/REEPE57272.2023.10086728