Karri Koivula

Discovering the potential of utilizing artificial intelligence in tax procedures
AI-powered artifact as a knowledge creation instrument

Vaasa 2022

School of Technology and Innovations
Master’s thesis in Information Systems Science
Master’s Programme in Digital Business Development

UNIVERSITY OF VAASA
School of Technology and Innovations
Author: Karri Koivula
Title of the Thesis: Discovering the potential of utilizing artificial intelligence in tax procedures: AI-powered artifact as a knowledge creation instrument
Degree: Master of Science in Economics and Business Administration
Programme: Digital Business Development
Supervisor: Ahm Shamsuzzoha
Year: 2022
Pages: 83

ABSTRACT:
Artificial intelligence, machine learning, and deep learning have become ubiquitous concepts. Interest in their utilization opportunities in many sectors has exponentially grown during recent decades partly due to the exponential growth of computer power and the increased availability of data, allowing for more powerful and sophisticated information technology solutions. Technological maturity has lowered the threshold, and various open-source libraries and active communities enable the utilization of algorithms such as neural networks in practice. This thesis set out to find whether deep learning algorithms could be utilized in a value-adding way in the procedure for limited liability companies responsible for handling tax claims in the case organization the Finnish Tax Administration. Additionally, the creation and deployment of artificial intelligence solutions should consider legal and ethical manners as restrictive key concerns.

The research was carried out according to the action design research method in which the focus of the research is concurrently building a suitable artifact for the organization and learning (design principles) from the creation and intervention itself.
The research method was chosen due to its inclination towards authenticity in the organization and organizational centricity. As a result, the project team consisting of three members created two functional artifacts: one based on neural networks and another based on self-organizing maps. The case organization provided data fueling the deep learning algorithms. Data consisted of financial information of anonymous limited liability companies in Finland. The artifacts were limited to function only as knowledge creation instruments due to legal and ethical limitations present in the context. Knowledge creation in this research context refers to the artifact's ability to identify customers not returning (defaulting) their income tax returns from others.

The created artifacts functioned sufficiently, and their ability to identify defaulting customers from others was promising. Results suggest that it is recommendable to approach problems with more than one artifact solution, and focused roles in the project team are recommended. Artificial intelligence-based artifacts are seen as value-adding since the knowledge created by them can potentially save time, liberate resources and expedite processes. However, finalized artifacts were not created, and testing was limited to a simulated environment. The design principles that emerged from the artifact creation focused on addressing the legal and ethical challenges associated with artificial intelligence in taxation to secure sustainable artifact creation and usage. Design principles were divided into three levels: trustworthiness through accuracy, legal and ethical restrictions and limitations of use, and justification of use. An organization-defined performance threshold needs to be reached by an artifact. An artifact must be transparent and regulated to fulfill context-specified legal and ethical limitations. Lastly, a preliminary inspection of artificial intelligence usage in a case organization is required.
Consequently, the preliminary results of this research should be validated by applying the concept in a case organization, followed by an analysis of the results in an end-user setting.

KEYWORDS: Machine learning, neural networks, self-organizing maps, tax procedures

VAASAN YLIOPISTO (University of Vaasa)
School of Technology and Innovations
Author: Karri Koivula
Title of the thesis: Discovering the potential of utilizing artificial intelligence in tax procedures: AI-powered artifact as a knowledge creation instrument
Degree: Master of Science in Economics and Business Administration
Programme: Digital Business Development
Supervisor: Ahm Shamsuzzoha
Year of graduation: 2022
Pages: 83

TIIVISTELMÄ (Finnish abstract):
Artificial intelligence, machine learning, and deep learning have become ubiquitous concepts. Interest in their potential uses in many sectors has grown over recent decades. The exponential growth of computing power and available data makes it possible to create more efficient and more complex solutions. The maturing of the technology has lowered the threshold, and open software libraries and active communities make it possible to use algorithms such as neural networks in practice. The purpose of this thesis was to study whether utilizing deep learning algorithms adds value in the claim for adjustment procedure for the taxation of limited liability companies in the Finnish Tax Administration. The creation and deployment of lawful and ethical AI applications was identified as a restrictive and central factor.

The research was carried out according to action design research, in which the aim is to simultaneously create an artifact suitable for the target organization and to learn (design principles) from the creation of the artifact and from the intervention in the organization. The research method was chosen because of its organization-centricity and organization-specific authenticity. As a result of applying the research method, a three-person project team created two working artifacts, one based on neural networks and the other on self-organizing maps. The target organization supplied the data required by the deep learning algorithms. The data consisted of financial information on anonymized Finnish limited liability companies. The artifacts were limited to functioning only as so-called knowledge-producing tools because of legal and ethical constraints. In the research context, knowledge production refers to the artifact's ability to identify customers that do not fulfill their obligation to file an income tax return.

The created artifacts performed at a sufficient level. Their ability to identify the desired customer group was promising. Based on the results, it is advisable to approach problems by creating several different AI applications. In addition, attention should be paid to focused roles in the project team. AI-based artifacts are seen as value-adding. Based on the knowledge they produce, it is possible to save time, free up resources, and speed up processes. Finalized artifacts released into the organization were not created. The design principles that emerged from creating and testing the artifacts focused on responding to the legal and ethical constraints related to the use of AI in taxation. This makes it possible to ensure a sustainable way of creating artifacts and putting them to use. The design principles were divided into three levels: trust through accuracy, restrictions on use imposed by law and ethics, and justification of the use of AI. The artifact must exceed an organization-specific performance threshold. The artifact must be transparent and regulated so that it complies with the constraints of its target environment. A preliminary study of where AI could be used in the organization is advisable. It is recommended that the preliminary results of this work be confirmed in the target organization, followed by an analysis of the results among end users.

AVAINSANAT (keywords): Machine learning, neural networks, self-organizing maps, tax procedures

Contents
1 Introduction
1.1. Background and purpose of research
1.2. Research problem and goal
1.3. Limitations
1.4. Research structure
2 Theoretical framework
2.1. Artificial intelligence
2.1.1. Machine learning
2.1.2. Deep learning
2.1.3. Performance evaluation metrics
3 Limited liability company and its taxation procedure
3.1. Principles of LLC’s taxation
3.2. Tax assessment procedure for LLC
3.2.1. Automated taxation in Finland
3.2.2. Overview of monitoring in tax assessment
3.2.3. Ethical principles for AI in the Finnish Tax Administration
4 Methodology
4.1. Justification of methodology
4.2. Action design research
4.2.1. Problem formulation
4.2.2. Building, intervention, and evaluation
4.2.3. Reflection and learning
4.2.4. Formalization of learning
4.3. Data collected for the study
5 Developing a knowledge creation artifact
5.1. Problem formulation
5.2. Building, intervention, evaluation
5.2.1. Alpha
5.2.2. Beta
5.2.3. Gamma
5.3. Reflection and learning
5.4. Formalization of learning
6 Discussion
6.1. Conclusions and recommendations
6.2. Research evaluation and restrictions
6.3. Suggestions for future research
References
Appendices
Appendix 1. Alpha, code sample
Appendix 2. Beta, code sample
Appendix 3. Gamma, code sample

Figures
Figure 1. Illustration of SOM
Figure 2. SL vs. UL. vs. RL
Figure 3. Feedforward neural network
Figure 4. Confusion matrix
Figure 5. Tax assessment procedure simplified
Figure 6. The ADR Method
Figure 7. IT-dominant BIE vs. Organization-dominant BIE
Figure 8. ADR Method including associated tasks
Figure 9. BIE viewpoints in the project.
Figure 10. Original IT-dominant BIE plan in the tax administration ADR project
Figure 11. Alpha, CM, 2017 test set
Figure 12. Alpha, CM, 2018
Figure 13. Alpha, CM, 2019
Figure 14. Beta with additional data, CM, 2017 test set
Figure 15. Beta with additional data, CM, 2018
Figure 16. Beta with additional data, CM, 2019
Figure 17. SOM, 2017
Figure 18. SOM, 2018
Figure 19. SOM, 2019
Figure 20. Actualized IT-dominant BIE cycles in the project

Tables
Table 1. Evaluation metrics
Table 2. Compressed adaption of net worth calculation
Table 3. Ethical principles for AI
Table 4. Dataset description
Table 5. Alpha, performance, 2017 test set
Table 6. Alpha, performance, 2018
Table 7. Alpha, performance, 2019
Table 8. Project team’s test plan for the beta
Table 9. Beta, performance, 2017 test set
Table 10. Beta, performance, 2018
Table 11. Beta, performance, 2019
Table 12. Artifact’s performance increased from alpha to beta
Table 13. Beta with additional data, performance, 2017 test set
Table 14. Beta with additional data, performance, 2019
Table 15. Beta with additional data, performance, 2019
Table 16. Artifact performance comparison
Table 17. Datapoints, 2017
Table 18. Datapoints, 2018
Table 19. Datapoints, 2019
Table 20. A preliminary set of design principles
Table 21. Summary of the ADR process

Abbreviations
ADR Action Design Research
AI Artificial Intelligence
ANN Artificial neural network
AR Action research
CM Confusion matrix
DL Deep learning
DR Design research
FN False negative
FP False positive
KBS Knowledge-based system
LLC Limited liability company
LR Linear regression
ML Machine learning
MLP Multi-layer perceptron
RL Reinforcement learning
ROC Receiver Operating Characteristic
SL Supervised learning
SOM Self-organizing map
TN True negative
TP True positive
UL Unsupervised learning

1 Introduction

The first chapter presents the background of the study, including the motivation behind it and a description of the problem domain. The chapter introduces the research problem, goal, objective, and questions. The chapter then proceeds to the study’s limitations and ends with the structure of the study.

1.1. Background and purpose of research

The Finnish Tax Administration’s main tasks include but are not limited to carrying out taxation and related payments, tax control, and recovery of unpaid taxes. The tax administration is accountable to the Ministry of Finance. (Tax administration act (2010/502) 1 § and 2 §) Tax control and how it should be done are not described explicitly in the law. Tax control’s main objectives include reducing the tax gap, combating the shadow economy preventatively and precisely, and collaborating with other authorities. Tax control is directed at everyone from the individual to the corporate level. (Finnish Tax Administration, 2019a)

Long processing times in the claim for adjustment procedure in the tax administration were identified as a research opportunity. The claim for adjustment procedure refers to the processing of tax claims (Tax assessment act, chapter 5). This research focuses on a specific group of customers within the claim for adjustment procedure: limited liability companies. Due to long processing times, the claim for adjustment procedure is viewed as a bottleneck procedure.
Partially responsible for the long processing times is the number of unnecessary cases created by corporate taxpayers not fulfilling their tax obligations within a set time limit. When a tax return is not filed, the tax year is estimated. Estimating a tax year means taxes are enforced based on an estimate and not on the company’s actual tax report. Reversing the estimated tax years is time-consuming for the tax administration. Identifying such cases would provide knowledge that helps tackle similar cases pre-emptively in the future. Decreasing the number of estimated tax years is helpful for customers and the tax administration. This research aimed to create a knowledge-creating artifact based on AI that could help the Finnish Tax Administration decrease the number of estimated tax years pre-emptively and, concurrently, to gain a better understanding of the potential uses of such artifacts. Vast amounts of data available in the tax administration create a suitable environment for AI solutions to be developed. Information created by AI solutions can decrease the amount of manual work. Additionally, as the claim for adjustment procedure strictly follows laws and regulations, the artifact and its creation focus on recognizing examples that go through the procedure and do not participate in decision-making of any sort.

Advancements in computer power, the exponential growth of available data, and the undeniable potential within artificial intelligence (AI) facilitated the idea for this thesis. AI solutions based on artificial neural networks (ANN) and self-organizing maps (SOM) were used to extract valuable information related to a bottleneck process in the tax administration. (Finnish Tax Administration, 2021a) This thesis aims to prove that AI could bring value to taxation and can extract useful information from the data.

Filing a claim is a fundamental right of individual and corporate customers.
However, many of the claims concern tax returns that were not filed in the first place. When a tax return is filed significantly late (10 months after the accounting period has ended), the processing time is approximately 12 months for several reasons, including manual processing, which is not optimal for the taxpayer or the tax administration. Two different AI solutions, IT artifacts, were built according to the action design research (ADR) method. The artifacts were created to function as pre-emptive knowledge-creating instruments. Additionally, the ADR method generated design principles for making such artifacts in similar settings.

This thesis is interested in finding out if an AI solution could detect and recognize the underlying characteristics in corporate taxpayers that would explain the number of late filings. Reducing the number of late filings would benefit the taxpayers as they would receive their tax decisions more swiftly. The tax administration would improve the quality and reduce the number of claims. ANN and SOM are applied to the available data to see how well they can differentiate a company that will file its tax return significantly late from one that will not.

This thesis answers whether AI could add value to taxation and what challenges might be faced when utilizing AI solutions in taxation. Answers to these questions are provided based on the results achieved by the ADR project carried out in the case organization.

1.2. Research problem and goal

The research problem is whether AI could add value to taxation in the given problem domain. The research goal is to answer the research problem of how AI could be applicable in taxation in the limited liability companies' (LLC) claim for adjustment procedure. The research objective is to create a suitable IT artifact with the case organization to address the research problem and reach the research goal. The artifact's performance and creation are analyzed afterward.
The creation of the artifact follows the ADR method. The researcher formed the research questions before the creation of the IT artifact began. The research questions are defined as:

1. How can AI be deployed to the case organization in Finland to create value in its current taxation system?

The first research question revolves around what kind of value and information AI creates in the case organization and how it can be achieved. Reasons and timing of the usage are considered and planned accordingly.

2. How can AI be deployed so that it does not violate rules and regulations?

The second research question focuses on challenges that AI usage should consider, such as trustworthiness, legality, and ethical restrictions, and how they are addressed.

1.3. Limitations

This research only focuses on utilizing ANN and SOM as a pre-emptive data analysis tool in the Finnish Tax Administration. In the context of this research, artificial neural networks (ANN) and SOM are part of deep learning (DL). Data used in this research is limited only to limited liability companies with a business source of income. The companies’ data is mainly taxation and accounting-related basic and business information.

1.4. Research structure

The introduction chapter presents the research problem, limitations, structure of the thesis, and the core of the idea to the reader. The second chapter focuses on machine learning (ML) and DL, which form the primary theoretical groundwork for this thesis. In the third chapter, the reader is introduced to LLCs and their taxation procedure in general. The fourth chapter presents the research method, and the fifth chapter outlines the available data used in this thesis. In the sixth chapter, the reader learns how ADR was implemented, how the artifacts performed, and what design principles came out of the process. Moreover, the sixth chapter provides answers to the research questions.
The seventh and final chapter discusses the research’s goal and objectives, covering how this research study was evaluated and the related restrictions. The final chapter also includes necessary suggestions for future research.

2 Theoretical framework

This chapter consists of the main theoretical background related to AI and its main concepts of ML and DL. This chapter ends by presenting performance evaluation metrics used in assessing the performance of IT artifacts created in this study.

2.1. Artificial intelligence

Akerkar (2019, p. 3-4) views the replication of biological, analytical, and decision-making capabilities as the essence of artificial intelligence. AI is often defined as “the science and engineering of imitating, extending and augmenting human intelligence through artificial means and techniques to make intelligent machines.” To be considered intelligent, a system should be able to learn in a changing environment (Alpaydin, 2014, p. 3).

The “Dartmouth conference” in 1956 is considered the official start of AI as it marked the beginning of AI as a research field (Shi, 2011, p. 2). To some extent, the famous Turing test by Alan Turing even precedes the “Dartmouth conference” by offering a view on how to identify an “intelligent machine” (Rahman, 2020, p. 15-17; Shi, 2011, p. 2). Artificial intelligence (AI) is divided into general and narrow categories. General AI refers to AI that can act in a “human intelligent way,” navigating different problem domains and adapting to ever-changing situations. Currently, there are no systems that can perform this way. Narrow AI is an application that can perform well in one or two things but cannot go beyond what it has been designed for, such as an AI application used to detect tax avoidance. The AI of today is in the narrow category. (Akerkar, 2019, p. 3-4)

Finlay (2018, p.
62) offers a general way of viewing AI problems by dividing them into two broad categories: simple and complex. Simple problems have a singular objective that must be determined and is easier to quantify. According to Alpaydin (2014, p. 5), a classification task with two separating classes is an example of such a problem. Whether a person is granted a bank loan or not is an example of a classification problem. Finlay (2018, p. 62) highlights that a complex problem has more than one objective. Problems that require multiple ML approaches combined, such as autonomous vehicles, are complex AI problems.

Knowledge-based systems (KBS) are perhaps AI’s most successful practical branch (Akerkar, 2019, p. 4). According to Shi (2011, p. 25), “KBS includes expert systems, knowledge base systems, intelligent decision support systems..” A KBS primarily consists of a knowledge base and an inference engine. The knowledge base includes facts, task-related specifics, and heuristic knowledge of the domain. The inference engine refers to various methods of deducing new information from the knowledge base (Shi, 2011, p. 25, 120; Benfer et al., 1991, p. 11).

2.1.1. Machine learning

“ML is the systematic study of algorithms and systems that improve their knowledge or performance with experience” (Flach, 2012, p. 3). According to Finlay (2018, p. 12), the main ingredients that fuel most AI and ML applications include data input, data preprocessing, predictive models, decision rules, and output. Due to exponential growth in computational power and the availability of a vast amount of data (big data), learning methods such as ML and DL have become more attractive (Alpaydin, 2014, p. 309).

In ML, we are interested in discovering patterns and useful approximations from data (Alpaydin, 2014, p. 2). Data input can be almost anything from sensory inputs such as videos to filed online forms such as tax returns.
Data preprocessing refers to turning data inputs into a computer-friendly format. (Finlay, 2018, p. 12)

Jung (2018, p. 9) divides ML into three components: (1) data, its features, and labels, (2) a model or hypothesis space, and (3) a loss function. Data is viewed as a collection of data points that contain information of any kind. The amount and quality of data are crucial for ML. Features are measurable properties of data (Mirjalili & Raschka, 2019, p. 9, 109). An essential part of ML is to figure out the features that have the most significant effect on the performance of ML. Data is often but not always labeled. Labels refer to higher-level information, and like features, they characterize a data point. ML’s model (hypothesis space) is a restricted, computationally feasible map of label and feature space. The map is called either a predictor or a classifier: for finite label spaces it is called a classifier, and for continuous label spaces it is called a predictor. (Jung, 2018, p. 7)

According to Jung (2018, p. 9), linear regression (LR) is a supervised machine learning method that uses linear maps for the hypothesis space. LR tries to find a map that could predict an accurate label of an output based on features of a data point. To acquire such a map, historical data is used to try out different options for the map and pick the best one. The purpose of the loss function is to measure the quality of a specific map. “Loss (approximation error) is the sum of losses over the individual instances.” (Alpaydin, 2014, p. 41-42). To determine the feasibility of the map, a measurement for the loss (or error) incurred needs to be specified. For the LR example involving numeric labels (regression problem), a commonly used choice for loss function is the squared error (R2) loss. (Jung, 2018, p. 26) ML requires data as its primary goal is to predict an outcome based on features (Giussani, 2020, p. 14).
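The idea of picking the best linear map by its loss can be illustrated with a minimal sketch. The toy data points, the function names, and the comparison map below are hypothetical and are not taken from the thesis or its appendices; the sketch only assumes a single feature and the sum of squared errors as the loss.

```python
# Hypothetical toy data: (feature, label) pairs, not taken from the thesis.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]


def predict(w, b, x):
    """Linear map h(x) = w * x + b, one element of the hypothesis space."""
    return w * x + b


def squared_error_loss(w, b, points):
    """Sum of squared errors over the individual data points."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in points)


def fit_least_squares(points):
    """Closed-form least-squares fit for a single feature."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


# The fitted map has the lowest squared error loss among all linear maps.
w, b = fit_least_squares(data)
best_loss = squared_error_loss(w, b, data)

# An arbitrary alternative map (hypothetical) for comparison.
other_loss = squared_error_loss(2.0, 0.0, data)
print(f"fitted: w={w:.2f}, b={b:.2f}, loss={best_loss:.3f}")
print(f"alternative w=2.0, b=0.0, loss={other_loss:.3f}")
```

Comparing `best_loss` with `other_loss` shows the role of the loss function: it ranks candidate maps so that the one with the smallest loss on the historical data can be selected.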
Received data almost always needs to be prepared before being used in ML (Lee, 2019, p. 107-117). A case-dependent number of iterations of training, validating, and testing is needed before a finished ML algorithm can be utilized in action. This includes splitting the dataset into testing and training sets, data feature-related selections, and dimensionality reduction (Chebbi, 2018, p. 213; Campesato, 2020, p. 28-34).

Knowledge representation refers to the fundamental goal of AI: the creation of AI that is capable of intelligent behavior as determined by humans (Shi, 2011, p. 18). Machine learning is a “facet of AI that focuses on algorithms, allowing machines to learn without being programmed and change when exposed to new data.” (Akerkar, 2019, p. 4). ML is seen as the most critical problem of AI (Shi, 2011, p. 18). According to Taiwo (2010, p. 4-5), ML is suitable for tasks that cannot be defined well except by example. ML can roughly be divided into supervised, unsupervised, and reinforcement learning. (Alpaydin, 2014, p. 9-13)

2.1.1.1. Supervised learning

According to Campesato (2020, p. 19), in supervised learning (SL), learning is done by exposing the learner to the data, including the known outcomes. This way, the machine can improve its performance, as it knows the desired results and what to pursue. Learning in SL occurs during training. (Rahman, 2020, p. 20) According to Alpaydin (2014, p. 5), both regression and classification are supervised learning problems.

2.1.1.2. Unsupervised learning

Unsupervised learning (UL) includes only the input, as output data is missing or excluded (Alpaydin, 2014, p. 11). Unsupervised learning involves unlabeled data or data of unknown structure. It is used to deduce important information from data to find patterns. (Lee, 2019, p.
5) Unsupervised learning is suitable for finding regularities in the data and detecting naturally occurring groups, such as in the k-means clustering algorithm (Alpaydin, 2014, p. 11; Rahman, 2020, p. 21).

Additionally, the self-organizing map (SOM) is a well-known UL method. SOM draws a topographical map of the data where similar observations are positioned closer to each other, and an ordered representation of the data is created. Figure 1, as presented by Kohonen (2013, p. 52-53), depicts the core idea of SOM: input data X is mapped to the model Mc that best represents it. Models (Mi) inside the same circle are more similar to Mc than the other models on the map are.

Figure 1. Illustration of SOM

2.1.1.3. Reinforcement learning

According to Alpaydin (2014, p. 517), in reinforcement learning (RL), a “decision-making agent” is acting in an environment from which it receives feedback (reward or penalty) when trying to solve a task. Based on the feedback, the agent should be able to learn the best policy for acting in the environment. Learning the policy is at the center of RL. An individual action is considered good if it supports the longer-term goal, such as a chess move that helps to win the game (Rahman 2020, p. 80).

RL is commonly utilized in games due to its nature as a sequence of actions towards a goal. More than one agent is possible in tasks where concurrent action is required. In cases of multiple agents, the agents communicate and cooperate to complete a task. (Dutta, 2018, p. 47-48)

Figure 2. SL vs. UL. vs. RL

Figure 2 depicts the three main learning categories in ML and their central learning policies.

2.1.2. Deep learning

DL is viewed as a subfield of ML that utilizes multiple layered ANNs to solve problems (Mirjalili & Raschka, 2019, p. 383-384). Instead of analyzing data linearly, neural networks enable machines to process data nonlinearly (Alpaydin, 2014, p. 306).
At its core, DL divides the learning process into connected steps, also known as layers, that are assigned to different sections of the main problem available to the whole network (Rahman, 2020, p. 80-81). According to Kelleher and Tierney (2018, p. 242), the strength of DL models lies in their ability to utilize previously gathered knowledge from the previous layers to their advantage in the following layers, which is referred to as backpropagation. In backpropagation, previously accumulated feedback from events in the network is used in future calculations within the network (Rahman, 2020, p. 22).

2.1.2.1. Artificial neural networks

An ANN mimics the human brain and its functions; hence the neural in the neural network refers to biological neurons in the brain (Graupe, 2013, p. 1). A neural network consists of layers that contain neurons that perform the required mathematical calculations (Rahman, 2020, p. 20). The neurons in the layers together form a parallel and interconnected network as each of the layers and their neurons might connect (Rahman, 2020, p. 20; Alpaydin, 2014, p. 267). The UL algorithm SOM is also considered a type of ANN (Kohonen, 2016, p. 724).

According to Mirjalili & Raschka (2019, p. 83), the three prominent neural network models include (1) feedforward neural network (FNN), (2) recurrent neural network (RNN), and (3) convolutional neural network (CNN). In FNN, the connections in the network are only moving forward (Kelleher & Tierney 2018, p. 124). In RNN, the neurons are also connected backward, resulting in the network having a short-term memory of past incidents (Mirjalili & Raschka, 2019, p. 83-84; Alpaydin, 2014, p. 305). RNN is utilized when the network is required to know the information of the previous layers. In CNN, “the work of each hidden unit is considered to be a convolution of its input“ (Alpaydin, 2014, p. 294).
Hidden units in CNN view the same input space from a different place, looking for additional features that are later intertwined into more useful information. CNN is used in visual recognition tasks. (Alpaydin, 2014, p. 295; Rahman, 2020, p. 84)

2.1.2.2. Multilayer perceptron

The multilayer perceptron is a feedforward neural network model as all the connections move towards the output (Kelleher & Tierney 2018, p. 124). Perceptron refers to the ensemble of a neuron and its input connections and weights (Alpaydin, 2014, p. 271, 273). Some parameters need to be set when training a multilayer perceptron network: the number of hidden layers in the network (an increased number of hidden layers makes the network “deep”), the activation function, which calculates a neuron’s activation threshold, and the batch size, which is the size of the data section passed to the network at a time in the training phase (Jung, 2018, p. 45). Further parameters are the number of epochs, which is the number of passes of the data through the network (Alpaydin, 2014, p. 285), and the learning rate, which is how quickly the network optimizes itself (Mirjalili & Raschka, 2019, p. 200).

Figure 3. Feedforward neural network

In figure 3 (adapted from Kelleher & Tierney 2018, p. 124), there are three layers of neurons: (1) input layer, A and B, (2) hidden layer, C, D, and E, and (3) output layer, F. Neurons in a neural network perform a set of operations:

1. Multiplying each input by a weight
2. Adding together the results of the multiplications
3. Pushing the result through an activation function

According to Kelleher and Tierney (2018, p. 121-136), “all the connections between the neurons in a neural network are directed and have a weight associated with them.” The weight applied to an input that a neuron receives is the weight on the connection coming to the neuron when the multi-input regression function over its inputs is calculated.
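The three neuron operations listed above can be sketched for a network with the same shape as figure 3 (inputs A and B, hidden neurons C, D, and E, output neuron F). This is a minimal illustration under stated assumptions: the input values and all weights are hypothetical, the sigmoid is chosen here only as an example activation function, and no bias term is included because the text does not mention one.

```python
import math


def sigmoid(z):
    """Example activation function (an assumption; the text names none)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, activation=sigmoid):
    """Compute one neuron's output from its inputs and connection weights."""
    weighted = [x * w for x, w in zip(inputs, weights)]  # 1. multiply by weights
    total = sum(weighted)                                # 2. add the results
    return activation(total)                             # 3. apply activation


# Hypothetical input values for neurons A and B.
inputs = [0.5, -1.0]

# Hypothetical weights on the connections A->X and B->X for X in C, D, E.
hidden_weights = {"C": [0.4, 0.1], "D": [-0.3, 0.8], "E": [0.2, -0.5]}

# Hypothetical weights on the connections C->F, D->F, and E->F.
output_weights = [0.7, -0.2, 0.5]

# Forward pass: the hidden layer first, then the output neuron F.
hidden = [neuron(inputs, w) for w in hidden_weights.values()]
output_f = neuron(hidden, output_weights)
print(f"C, D, E = {[round(h, 3) for h in hidden]}, F = {output_f:.3f}")
```

Because every hidden neuron receives both inputs and F receives all three hidden outputs, the sketch corresponds to a fully connected feedforward network; training the weights is outside its scope.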
As seen in figure 3, the flow of information between the neurons in the network is presented by arrows. The neural network in figure 3 is considered fully connected because each neuron is connected to all the neurons in the subsequent layer. The tags on the arrows reveal the weight that the neuron at the end of the arrow applies to the information passing through the connection. In figure 3, the calculation performed by neuron F of the network can be defined as:

$$\text{Output} = \varphi(\omega_{C,F}\,C + \omega_{D,F}\,D + \omega_{E,F}\,E)$$

* $\varphi$ = activation function
** $\omega$ = weight applied to the neuron

2.1.2.3. Predicting future events in taxation using DL models

Tax officials in Finland have already utilized AI in taxation and expressed a growing interest in its usage (Finnish Tax Administration, 2021b). Chen et al. (2011) developed an automatic detection model for discovering erroneous tax reports. The study was motivated by the criticality of tax reporting and the large number of errors found in reports in recent years. Detecting erroneous tax reports is tedious and depends on experienced personnel. Therefore, there is a need for an automatic solution that reduces the workforce needed for the job. The model in the study by Chen et al. (2011) was built with various NN methods that were compared to each other. The different approaches were “multi-layer perceptrons, learning vector quantization, decision tree, and hyper-rectangular composite neural networks methods.” The data consisted of construction companies in Taiwan. No matter which NN approach was used, the correct recognition rate reached nearly 80 %. The best-performing approach, the hyper-rectangular composite neural network, was able to derive almost 250 valuable rules for identifying erroneous tax reports from the data.

Studies by Xiangyu et al. (2018) and Pérez López et al. (2019) focused on tax evasion. Xiangyu et al.
(2018) developed a neural network model to tackle the issue of tax evasion in automobile sales enterprises in China. The objective of the NN-based recognition model was to detect behavior related to tax evasion. Pérez López et al. (2019) utilized an MLP neural network model to identify tax fraud concerning personal income tax returns in Spain. The result in both cases was a success. Xiangyu et al. (2018) reached a recognition accuracy of 89 %. The result was assessed with the Receiver Operating Characteristic (ROC) curve, which showed that the classification effect was good. Pérez López et al. (2019) achieved an efficiency rate of 84.3 %.

Moreover, the NN by Pérez López et al. (2019) offered information on the probability of each taxpayer’s inclination to evade taxes. Based on the results, MLP is beneficial for classifying taxpayers as fraudulent or non-fraudulent. The robustness of the model was confirmed with the ROC curve, which verified the NN’s high predictive capacity.

A study by Rahimikia et al. (2017) focused on tax evasion with a more complex approach. Rahimikia et al. (2017) created a novel hybrid intelligent system to detect corporate tax evasion in Iran. The hybridity came from combining NN, SVM, and LR classification models with the harmony search (HS) optimization algorithm, which is inspired by the improvisation process of musicians. The system was tested in the food and textile sectors. The researchers concluded that the system could accurately detect hidden patterns in tax returns that could point toward tax evasion. The results offer valuable, sector-wise information about the financial structure of tax evasion. The hybrid system is seen as a useful tool for detecting tax evaders and for identifying patterns that suggest tax evasion.

Additionally, tax officials have utilized NNs in social media. Zhang et al. (2020) developed a proof-of-concept NN to identify transaction-based tax-evading activities in the hidden economy of social media.
The dataset consisted of “Instagram posts about #lipstick and manually annotated sampled posts with multiple labels related to sales and tax evasion activities.” The purpose of the NN detection model was to identify suspicious social media posts. The posts deemed more suspicious by the NN were afterward analyzed by tax officials. As the NN model identifies the suspicious posts first, the productivity of manual work improves from 22 percent to 72 percent. The NN model improves manual labor efficiency, as the tax officers do not have to select the posts randomly.

2.1.3. Performance evaluation metrics

Several ways exist to measure the performance of a neural network model. The performance of the neural network created in this thesis is evaluated with the help of accuracy, precision, recall, and F1-score, all derived from the confusion matrix.

The confusion matrix (CM) presents the performance of a learning algorithm. The CM is a square matrix that reports the count of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), as presented in figure 4 (adapted from Rokach, 2009, p. 160-161). Table 1 illustrates how the metrics are calculated (adapted from Giussani, 2020, p. 62-64).

Figure 4. Confusion matrix

Table 1. Evaluation metrics
Metric | Calculation | Definition
Accuracy | (TP + TN) / (FP + FN + TP + TN) | Sum of correct predictions divided by all predictions.
Precision | TP / (TP + FP) | Correctness of the model per class. Useful in imbalanced class problems.
Recall | TP / (FN + TP) | The number of correctly evaluated instances per class divided by all the correct examples in the class. Useful in imbalanced class problems.
F1-score | (2 x Precision x Recall) / (Precision + Recall) | A balanced combination of precision and recall.
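The calculations in table 1 can be sketched in a few lines of Python. This is only an illustration of the formulas, not the evaluation code used in the thesis; the function name, the zero-division handling, and the confusion matrix counts in the example are assumptions.

```python
def evaluation_metrics(tp, tn, fp, fn):
    """Derive accuracy, precision, recall, and F1-score from confusion matrix counts."""
    accuracy = (tp + tn) / (fp + fn + tp + tn)
    # Returning 0.0 when a denominator is zero is an assumed convention.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (fn + tp) if (fn + tp) > 0 else 0.0
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for a binary classifier evaluated on 100 instances.
metrics = evaluation_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With imbalanced classes, accuracy alone can look high even when the minority class is rarely identified, which is why precision and recall are reported alongside it.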
3 Limited liability company and its taxation procedure

This chapter gives an overview of what an LLC is and what is expected from LLCs in terms of accounting and taxation. It also gives an overview of the taxation procedure of LLCs in Finland, focusing on automation, monitoring, and ethical principles for AI in the Finnish Tax Administration. Legislative changes concerning LLCs that came into force in 2020 or after are not considered, as the data used in this study is from tax years 2017, 2018, and 2019.

3.1. Principles of LLC’s taxation

According to the limited liability companies act (2006/624), chapter 1, 1 §, subsection 1, an LLC is a separate taxpayer from its owners, created through registration in the trade register. By registering in the Finnish trade register, an LLC becomes a discrete taxpayer. Stockholders are not personally responsible for the LLC’s liabilities (Limited liability companies act chapter 1, 2 §, subsection 2).

According to the act on bookkeeping (1997/1336) chapter 1, 1 §, subsection 1, paragraph 1, LLCs have an accounting obligation for each accounting period. LLCs are obliged to compose a financial statement, including a balance sheet, a profit/loss (P/L) statement, and an annual report, for each accounting period according to the act on bookkeeping and the limited liability companies act (Limited liability companies act chapter 8, 3 §). The accounting period is 12 months, except at the beginning and end of the business and in cases of alterations to the accounting period (Act on bookkeeping chapter 1, 1 §, subsection 1).

The balance sheet shows the corporation’s financial status: the relationship between assets and liabilities. The P/L statement presents how the accounting period’s outcome came to be. In some cases, LLCs are required to submit a separate funds statement, which shows how funds were acquired and utilized (Act on bookkeeping chapter 3, 1 §, subsection 1, paragraph 1, 2, and 3).
Larger corporations, such as public LLCs, are expected to submit an annual report, which is a written report on the company’s development and profitability, financial situation, and most significant risks (Act on bookkeeping chapter 3, 1 §, subsection 3). A financial statement is required to present a genuine, essential, and sufficient picture of the profitability and economic status of the company (Act on bookkeeping chapter 3, 1 §, subsection 1). Corporations’ tax obligations are based on the financial statement created according to the act on bookkeeping (Act on bookkeeping chapter 1, 1 a §, subsection 2).

According to the income tax act (1992/1535) section 1, subsection 4, corporations in Finland are liable for tax. Corporations consist of governments, municipalities, congregations, limited liability companies (LLC), and foreign estates (Income tax act 3 §, subsection 1). Monetary benefits received by the taxpayer are taxable income. The taxpayer has the right to deduct expenses related to acquiring and retaining the benefits. Therefore, profit or loss is determined by subtracting tax-deductible costs from taxable income (Income tax act 29 §, subsection 1). Corporations' taxable income is calculated separately for each source of income (Income tax act 30 §, subsection 4).

Before 2020, three different income sources existed for LLCs: personal, business, and agriculture. Personal source income is taxed according to the income tax act. Business source income is taxed according to the act on the taxation of business income (1968/360). Agricultural source income is taxed according to the act on agricultural income tax (1967/543).

The tax rate for corporations is 20 percent (Income tax act 124 §, subsection 2). Confirmed losses are deducted in the order they have occurred (Income tax act 117 §, subsection 2). Losses from business activities are deductible from taxable income for ten subsequent years.
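The arithmetic described above can be sketched as follows. This is a simplified illustration and not a statement of how the Finnish Tax Administration computes tax: it assumes a single source of income, whole-euro amounts, and that a loss confirmed for year Y may be deducted in years Y+1 to Y+10. The function names and all figures are hypothetical.

```python
CORPORATE_TAX_RATE_PERCENT = 20  # rate stated in the text
LOSS_CARRY_FORWARD_YEARS = 10    # "ten subsequent years" (assumed reading)


def profit_or_loss(taxable_income, deductible_costs):
    """Profit or loss = taxable income minus tax-deductible costs."""
    return taxable_income - deductible_costs


def deduct_confirmed_losses(profit, confirmed_losses, tax_year):
    """Deduct earlier confirmed losses from a profit, oldest loss first.

    confirmed_losses maps the year a loss occurred to its unused amount.
    Returns the remaining profit and the remaining unused losses.
    """
    remaining = dict(confirmed_losses)
    for loss_year in sorted(remaining):
        if profit <= 0:
            break
        age = tax_year - loss_year
        if age < 1 or age > LOSS_CARRY_FORWARD_YEARS:
            continue  # loss is not from an eligible earlier year
        used = min(profit, remaining[loss_year])
        profit -= used
        remaining[loss_year] -= used
    return profit, remaining


def corporate_tax(taxable_profit):
    """Apply the corporate tax rate; a loss results in no tax."""
    return max(taxable_profit, 0) * CORPORATE_TAX_RATE_PERCENT / 100


# Hypothetical company in tax year 2019.
result = profit_or_loss(taxable_income=100_000, deductible_costs=70_000)
profit_after_losses, losses_left = deduct_confirmed_losses(
    result, {2016: 5_000, 2018: 10_000}, tax_year=2019
)
tax_due = corporate_tax(profit_after_losses)
print(profit_after_losses, losses_left, tax_due)
```

Sorting the loss years before deducting reflects the rule that confirmed losses are used in the order in which they occurred.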
Losses belonging to the same source of income from the previous tax years are deducted from the current fiscal year’s taxable profit in the same source of income (Income tax act 119 §, subsection 1).

According to the act on assessment of assets in taxation (2005/1142) 2 §, subsection 1, the net worth (positive/negative) of a non-public LLC is calculated by subtracting the total amount of liabilities from the total amount of assets. An LLC’s assets consist of fixed assets and other non-current investments, current assets, financial assets, and other assets with monetary value. Liabilities include borrowed capital in the balance sheet. (Act on assessment of assets in taxation 2 §, subsection 2 and 3)

The assets in the 2017, 2018, and 2019 tax returns consisted of the following: fixed assets and other non-current investments, current assets, financial assets, and other long-term investments. The liabilities consist of current and non-current liabilities. Net worth is calculated by subtracting liabilities from assets, as presented in adapted table 2. Capital, equity, and reserves are presented as recorded in accounting. (Finnish Tax Administration, 2021c)

Table 2. Compressed adaptation of net worth calculation
1 ASSETS | 2 LIABILITIES
Fixed assets and other non-current investment | Current liabilities
Current assets | Non-current liabilities
Financial assets |
Other long-term investments (Income Tax Act) |
ASSETS TOTAL | LIABILITIES TOTAL
NET WORTH - POSITIVE | NET WORTH - NEGATIVE
3 CAPITAL, EQUITY, AND RESERVES
Restricted equity
Unrestricted equity
CAPITAL, EQUITY, AND RESERVES TOTAL

3.2. Tax assessment procedure for LLC

The act on tax assessment (1995/1558) is a law of 12 chapters and 96 sections on taxation procedures and claims for adjustment in income taxation. It dictates taxpayers’ reporting responsibilities and the procedures of tax assessment.
Principles of tax assessment procedures are mainly based on the fourth paragraph of the tax assessment act and claims for adjustment on the fifth paragraph.

The Finnish Tax Administration carries out taxation in Finland. Taxation is carried out based on taxpayers’ reporting and reports received from external parties. (Tax assessment act 6 § and 26 §)

According to the act on administrative procedures (2003/434) 1 §, the purpose of the law is to execute and advance good administration and due process in administrative procedures. The purpose of the law is also to advance the quality and productivity of administrative services, such as the tax assessment procedure and carrying out taxation.

LLCs are obligated to submit their report (tax return) within four months after their accounting period has ended. If an LLC neglects its obligation, its taxation is estimated by the tax administration. To do this, the tax administration must send the taxpayer a hearing about the estimation. Taxation will be estimated if the reporting obligation is not fulfilled within the time reserved in the hearing. (Tax assessment act 7 §, 8 §, and 27 §; Tax administration’s decision on reporting duties and notes (A123/200/2016))

For corporations, taxation ends at the latest ten months after the end of their tax year closing month (= end of the accounting period). If the taxpayer has not filed their tax return within ten months, the tax decision will be based on the estimate made by the tax administration. To adjust a closed tax year, an adjustment claim is required. The processing time for a closed tax year is 12 months. (Tax assessment act 49 § and 61 §; Finnish Tax Administration, 2021a) Figure 5 demonstrates how the taxation procedure occurs for an LLC.

Figure 5. Tax assessment procedure simplified

The principles of the tax assessment procedure are a guide based on the tax assessment act and administrative procedures.
It maps out the basic principles of how taxation is carried out in Finland. The principles of taxation procedures apply to all taxpayers in general. Taxation procedures are primarily based on the tax assessment act. However, administrative procedures can supplement the tax assessment act if not otherwise stated. (Finnish Tax Administration, 2015)

3.2.1. Automated taxation in Finland

The tax administration is responsible for the tax assessment procedure. As taxation is mainly based on the reporting by taxpayers, it is left to the tax administration to ensure that taxation is carried out correctly. Reports subject to manual control are selected based on specific rules. (Finnish Tax Administration, 2020)

The number of tax reports is vast. With over 15 million tax-related decisions, not everything is manually reviewed. Automated decision-making is necessary due to the immense amount of tax-related work. Many tax-related procedures are carried out automatically without review by a tax official. (Finnish Tax Administration, 2020)

Automation is directed to undisputed matters that are not selected for manual control and can be solved without consideration. Cases not selected for manual control are formal. Tax decisions made by automation are not explicit and binding decisions made by the tax officials. If an error occurs, it can be corrected afterward by the taxpayer and the tax administration. The tax administration does not utilize artificial intelligence in tasks that require consideration and decision-making. Those are left solely to tax officials. (Finnish Tax Administration, 2020)

All the assessments and procedures made by the tax administration are based on law. Automated decision-making is used when it is possible to program a set of rules based on legislation. The algorithms or logic behind automation have not been strictly defined in legislation.
However, automated solutions can only be used in situations that have been mentioned explicitly in the law or in procedures based on law. AI, statistics, or scientific models are not used in automation. (Finnish Tax Administration, 2020)

Safeguard measures are taken to ensure that taxation does not violate the fundamental rights of taxpayers. Measures directed strictly and only at automated taxation do not exist. However, the general safeguard measures also apply to automated taxation. The safeguard measures are the following:
1. Taxpayers are heard before decision-making that requires consideration.
2. Taxpayers have the right to submit a claim that a tax official processes.
3. Right to know when taxation has been assessed automatically.
4. Right to know what is automatically assessed.
5. Right to expect that taxation is processed by a tax official when additional reporting is submitted.
(Finnish Tax Administration, 2020)

Automation has been used in taxation in Finland since 2005 because it has increased the performance of tax-related processing. In 2019, the deputy ombudsman of the Finnish Parliament released a decision on automation in taxation and its relationship with taxpayers’ due process, good administration, and tax officials’ liabilities. The decision followed several complaints by taxpayers who had problems with automated taxation. (Finnish Tax Administration, 2020; Finnish Parliament Ombudsman, 2019)

According to the deputy ombudsman, automated taxation is not based on appropriate and precise legislation that considers good administration, due process, and the tax official’s liabilities; it is therefore against the fundamental law. Immediate investigation of regulative needs is expected. (Finnish Parliament Ombudsman, 2019)

The deputy ombudsman pointed out that the legislation behind automated taxation applies only to tax assessment, not tax-related decision-making.
Taxation is mainly automated, and many phases, from hearing to decision-making, occur in automation without any tax officials taking part in the process. Additionally, the automated process and the algorithms behind it are not transparent. For transparency to be realized, precisely defined legislation is required. Taxpayers should be openly notified when and how their taxation has been automatically assessed, so they can evaluate whether it has been done correctly and whether their due process is respected. (Finnish Parliament Ombudsman, 2019)

3.2.2. Overview of monitoring in tax assessment

The strategic goals of the Finnish Tax Administration are securing tax funding, righteously carrying out taxation, and positive customer service. The automation rate in taxation has increased significantly through the years. New technologies have allowed citizens to fulfill their responsibilities and the tax administration to validate, assess, and control the flow of reports, payments, and information more efficiently. (Finnish Tax Administration, 2019a)

The more routine tasks, such as copying information and comparing data, have already been partly given to robots. Robots in this context refer to robotic process automation (RPA), which is suitable for routine tasks and is based on explicit rules in digital form. Such tasks have the characteristics of assembly-line work. (Finnish Tax Administration, 2019a)

3.2.3. Ethical principles for AI in the Finnish Tax Administration

In 2018, the Finnish Tax Administration released its ethical principles for AI. The Finnish Tax Administration joins in the pursuit of attaining ethical and trustworthy AI. According to the Finnish Tax Administration, ethical principles will be considered in all decision-making concerning AI. (Finnish Tax Administration, 2019c)

The principles of the tax assessment procedure are based on two laws: the legislation on tax assessment and administrative procedures.
The ethical principles for AI are based on the principles of the tax assessment procedure. The ethical principles for AI consist of four main principles, as presented in table 3.

Table 3. Ethical principles for AI
P1 | Reliable data. | How AI solutions function is known, and detailed knowledge of their operating principles has been acquired. AI will not be given access to data until it is certain that the data is reliable and suitable for its purpose. Reliability and suitability will be actively monitored even when data is used. Inaccurate data and algorithms will undergo necessary corrections promptly. (Finnish Tax Administration, 2019c)
P2 | A human is always responsible. | AI can be taught by a human, or it can learn by itself though it is continuously monitored by a human. Suggestions created by AI can always be changed. AI can only proceed in the decision process if it can be traced and justified afterward. A party in charge of AI has been named. (Finnish Tax Administration, 2019c)
P3 | AI follows laws and regulations. | Usage is monitored and evaluated. Immediate action will be taken in case of divergence. AI will not endanger taxpayers’ tax data security or confidentiality. The use of AI doesn’t jeopardize the legal protection of the taxpayer or the person responsible for AI’s decision. AI partners are selected with care, and tax administration takes full responsibility for the operations of the entire supply chain. AI solutions undergo the same safety principles as all other IT systems utilized in the Finnish Tax Administration. (Finnish Tax Administration, 2019c)
P4 | Tax administration takes part in public discussion on responsible and ethical AI applications. | Adoption of ethically sustainable AI technologies and international procedures is promoted. The tax administration also influences changes in legislation. Tasks in which AI is utilized will be openly communicated to the public.
(Finnish Tax Administration, 2019c)

4 Methodology

This chapter consists of the reasoning behind the selection of the ADR research method, the theoretical framework of ADR, and the data collected for the study.

4.1. Justification of methodology

The researcher chose ADR by Sein et al. (2011) as the research method for the project due to its flexibility, authenticity, and organization centricity. The ADR team in the project consisted of three professionals working on different tasks within the case organization, including the researcher. Several meetings and altogether three development cycles occurred during the project to create a suitable IT artifact that could assist the procedure by offering new insights. The problem could be classified in the simple category described by Finlay (2018, p. 62) and Alpaydin (2014, p. 5).

ADR outcomes consist of an IT artifact for the organization and design principles, in the form of generalized outcomes, for the scientific community (Sein et al., 2011, p. 44). As a result of utilizing ADR in the case organization's restricted and authentic setting, the project concluded by creating two different IT artifact solution concepts: an NN as an SL type of solution and a SOM-based algorithm as a UL type of solution. Moreover, preliminary design principles emerged from the ADR process.

4.2. Action design research

This study adopted action design research (ADR), a research method whose research process focuses on building innovative IT artifacts in their organizational settings while concurrently learning from the intervention and assessing it (Sein et al., 2011, p. 37-38). ADR is seen as applicable when the establishment of an “in-depth understanding of the artifact–context relationship is needed to develop a socio-technical design agenda for a specific class of problems.” (Sein et al., 2011, p.
52-53)

ADR combines ideas from two known research methods: action research (AR) and design research (DR) (Tiainen et al., 2015, p. 19). In AR, the researcher aims to solve practical issues or improvement demands by intervening (changing practice) in an organization. The intervention is done closely with the organization. Results from AR benefit both the organization, in the form of problem-solving, and the scientific community, in the form of new practical knowledge on the subject matter. (Tiainen et al., 2015, p. 2) In DR, the researcher strives to create an IT artifact to solve a problem within the organization (Tiainen et al., 2015, p. 3). However, in DR, as opposed to AR, the researcher is solely responsible for the IT artifact, and organizational context and collaboration do not play a significant role in creating the artifact (Sein et al., 2011, p. 38).

ADR was proposed as a new research method by Sein et al. to address the need for a research method that would recognize more profoundly the organizational context and its effects on the IT artifact. In ADR, the research focuses on creating a suitable IT artifact for the organization, as opposed to AR, in which the focus is on making changes to the organization and its activities. (Sein et al., 2011, p. 38-40) However, the precise definition of an IT artifact is still a matter of dispute due to inconsistencies in the term's usage. (Alter, 2015, p. 48-50; Sein et al., 2011, p. 38)

Sein et al. (2011, p. 38-39) view an IT artifact as an ensemble where the organizational domain is structurally engraved into the artifact during development and later usage. Consequently, in ADR, the IT artifact is viewed as an ensemble emerging from the intersection of development and intent of the researcher, contextual factors, refinement, and usage, as well as its influences on the IT artifact.
The ADR method deals with two challenges: (1) addressing a context-specified problem by intervening and evaluating; and (2) creating and evaluating an “IT artifact that addresses the class of problems typified by the encountered situation.” To fulfill the requirements of both challenges, the method focuses on the building, intervention, and evaluation of the created artifact. The created artifact “reflects on theory, the intent of the researchers, the influence of users and ongoing use in context.” (Sein et al., 2011, p. 40)

As the nature of the artifact is an ensemble, it must be stated that ADR deals with the following critical issues:
• Evaluation and building of the artifact are done in cycles and not in sequences as in DR.
• Evaluation of the artifacts should occur naturally and whenever possible, as controlled assessment is challenging to design and conduct.
• “Innovation must be defined for the class of systems typified by the ensemble artifact.” (Sein et al., 2011, p. 43-44)

The ADR method’s stages and the principles addressing the aforementioned issues are seen in figure 6 (adapted from Sein et al., 2011, p. 41).

Figure 6. The ADR Method

4.2.1. Problem formulation

The ADR method is triggered by a practical problem or the researcher’s initiative. The preliminary empirical investigation aims to “identify and conceptualize a research opportunity based on existing theories and technologies.” In addition, the scope, roles in the research, practitioners’ participation in the problem solving, and initial research questions are formed in stage 1. (Sein et al., 2011, p. 40)

Two critical elements are identified in the first stage of ADR: ensuring long-term commitment from the participating organization and defining the problem as a class of problems. The anchoring principles in stage 1:
• Principle 1: Practice-Inspired Research.
• Principle 2: Theory-Ingrained Artifact. (Sein et al., 2011, p.
40)

Principle 1 views practical problems as “knowledge creation opportunities” at the intersection of the organizational domain and technology. The action design researcher is expected to generate knowledge that applies to a class of problems exemplified by the organization’s problem. Therefore, the ADR team is not expected to solve the practical problem in the organization rigorously but to merely “intervene within the organizational context of the problem.” (Sein et al., 2011, p. 40)

Principle 2 requires that the artifact is based on theory. The initially designed artifact should be founded on generalized theory, with the researcher inscribing theoretical elements into it. Afterward, the artifact is subjected to “cycles of intervention, evaluation and reshaping” in the organizational context. (Sein et al., 2011, p. 40-41)

4.2.2. Building, intervention, and evaluation

Stage 2 is based on the framed problem and theoretical premises from stage 1, which work as the groundwork for the initial design of the IT artifact. Subsequent development phases take place in stage 2. (Sein et al., 2011, p. 41)

Artifact building, organizational intervention, and evaluation (BIE) are carried out interwovenly as an iterative process in the organizational context. Constant assessment of the problem and the artifact and the articulation of design principles occur in the BIE stage. The result of the BIE stage is the realized artifact. The BIE stage also determines where the innovation comes from: the design of the artifact or the organizational intervention. (Sein et al., 2011, p. 41-42)

Stage 2 has two endpoints on the research design continuum: IT-dominant BIE and organization-dominant BIE. At one end, the IT-dominant BIE focuses on creating an innovative technological design. The more mature version of the artifact (beta version) is introduced in the organizational setting. Subsequently, a new BIE cycle is started, or the researcher exits the project.
(Sein et al., 2011, p. 42)

On the other end is the organization-dominant BIE, which is most convenient for generating design knowledge primarily from the intervention. In the organization-dominant BIE, the ADR team challenges existing ideas and assumptions about the artifact’s usage in the context. In this BIE, the artifact is introduced in the organizational setting in an earlier phase (alpha version). (Sein et al., 2011, p. 43)

Stage 2 has three principles that emphasize the inseparability of the domains influencing the artifact:
• Principle 3: Reciprocal Shaping.
• Principle 4: Mutually Influential Roles.
• Principle 5: Authentic and Concurrent Evaluation. (Sein et al., 2011, p. 43)

Principle 3 underlines the strong, inseparable influences of the IT artifact and the organizational context on each other. Recursive cycles make it possible to gain an increased understanding of the organizational context and, therefore, to adjust its interpretation and change the chosen design constructs if needed. (Sein et al., 2011, p. 43)

Principle 4 focuses on mutual learning. While researchers offer theoretical knowledge, the practitioners offer insights into organizational presumptions and policies. Contributions from different participants might compete with or complement one another. Individual participants could have multiple roles. However, clarity of assignment responsibilities is worth pursuing for the sake of the research experience. (Sein et al., 2011, p. 43)

Principle 5 accentuates that evaluation is an ongoing and interwoven part of the research process and not a subsequent stage. Evaluation cycles for beta and alpha versions differ from one another. “Evaluation cycles for the alpha version are formative, contributing to the refinement of the artifact and surfacing anticipated and unanticipated consequences.” In contrast, the evaluation of the more mature beta version assesses value and utility outcomes.
Authentic evaluation is seen as more fruitful than controlled evaluation, which is hard to engineer. (Sein et al., 2011, p. 44)

The main differences between IT-dominant BIE and organization-dominant BIE are showcased in figure 7 (adapted from Sein et al., 2011, p. 42-43).

Figure 7. IT-dominant BIE vs. Organization-dominant BIE

4.2.3. Reflection and learning

In stage 3, the research moves from building a solution for a specific instance to applying that learning to a broader class of problems. Stage 3 is a “continuous stage and parallels the first two stages as depicted” in figure 6. (Sein et al., 2011, p. 44)

Stage 3 reflects on the research process and sees it as more than simple problem-solving. To ensure that knowledge is genuinely identified, conscious reflection on the theories, problem framing, and the emerging ensemble is required. Based on early evaluation results, adjustments might be necessary to understand the artifact research process better. (Sein et al., 2011, p. 44)

The only principle in stage 3 is guided emergence (Principle 6). The seemingly contradictory terms emphasize the dynamic reflection of the ensemble artifact on the initial design, current development in the organizational context, and simultaneous outcomes from authentic evaluation. Signals implying the need for trivial or substantial refinements are expected to arise and to be dealt with in a project. (Sein et al., 2011, p. 44)

4.2.4. Formalization of learning

According to Sein et al. (2011, p. 44), the fourth stage is focused on formalizing the learning. “Learning from problem-specific solutions should be transformed into solution concepts for a class of field problems.” The outcomes can be turned into design principles refining the theories that influenced the initial design. Stage 4 draws from the generalized outcomes principle (Principle 7). Due to the situated nature of ADR, generalization is seen as challenging to achieve. The ensemble artifact “represents a solution that addresses a problem.
“ In ADR, the transition “from specific-and-unique to generic-and- abstract” is critical. A three-level conceptual move to address this is suggested: 1. Problem instance generalization 2. Solution instance generalization 3. Derivation of design principles Principle 7 deals with casting known factors as instances of their classes. The problem, the solution, and design principles (knowledge capturing) are cast into their respective classes. Finally, through derivation, it is possible to connect generalized outcomes, the 42 design principles “to a class of solutions and a class of problems.” (Sein et al., 2011, p. 45) Figure 8 (based on figure 1 by Sein et al., 2011. p. 41) is an adapted and modified figure that presents the ADR as a concept, including tasks related to each phase. Figure 8. ADR Method including associated tasks 4.3. Data collected for the study The study data collected consisted of private limited liability companies in Finland. Of those, a specific form of a customer group was chosen as the focus of the artifact: LLCs with estimated tax years. In this study, financial data/information was considered from the years 2017 to 2019 of the companies conducting solely business activities under the act on business income only. The companies with personal and agricultural income 43 sources were excluded. Data for tax years 2017, 2018, and 2019 consisted of information from 94 889 companies in the original dataset and 203 617 companies in the corrected dataset. Data gathered from the tax system included specific information about the com- panies per tax year, as presented in table 4. Table 4. Dataset description 1. The main line of business as a 2-digit code expressing to which specific business sector the company belongs 2. Starting date in taxation 3. Starting date in the trade register 4. Closing date in taxation * 5. Closing date in trade register * 6. Reason for closing the trade register * 7. Home municipality 8. Tax year (2017, 2018, or 2019) ** 9. 
Net sales per year (€) 10. The total taxable business income per year (€) 11. Purchases, variation in stocks and inventory per year (€) 12. Total tax-deductible business costs per year (€) 13. Assets total per year (€) 14. Liabilities total per year (€) 15. Taxation estimated at some point (yes/no) 16. Taxation still estimated (yes/no) * = information expressed if available ** = tax year is defined by the closing date of the accounting period, i.e., the accounting period ends 31.1.2018, the tax year is 2018 Standard Industrial Classification TOL 2008 The main line of business is based on Standard Industrial Classification TOL 2008, which is formed from five hierarchical levels. This thesis utilizes only the first two levels. (Sta- tistics Finland, 2021) 44 Closing a limited liability company A limited liability company can be closed (dissolved) by going into liquidation by decision of the General Meeting, through a merger or demerger, bankruptcy, deregistration, or liquidation by order of the authority. (Finnish Patent and Registration Office, 2014) 45 5 Developing a knowledge creation artifact This chapter presents how ADR was utilized to form the IT artifact(s), how the creation proceeded, how the IT artifact performed, what was learned, and what kind of design principles came out of the process. This project aimed to create a suitable IT artifact for the case organization, the Finnish Tax Administration, according to the ADR research methods definition presented in chapter 4. 5.1. Problem formulation Building an IT artifact to address the identification problem on the claim for adjustment procedure rose from the researcher’s interest in machine learning and its newest ad- vances in taxation and fields similar to it. Such studies include but are not limited to Chen et al. (2011) with an automatic detection model for tax reports, Xiangyu et al. (2018) with a NN focusing on tax evasion, and Pérez López et al. (2019) with a NN identifying tax fraudulent persons. 
Additionally, Zhang et al. (2020) developed a novel approach in which a NN identifies tax evasion from social media posts. The studies indicate that the technology has been increasingly applied in practice with encouraging outcomes. (Sein et al., 2011, p. 40)

As the researcher was an employee processing the claims in the LLCs’ claim for adjustment procedure, he had firsthand knowledge of the problem in its context. The binary nature of the problem (has been estimated or not) and the actual need to decrease the processing time in the claim for adjustment procedure resulted in the researcher presenting an idea to create an IT artifact based on AI algorithms to tackle this issue.

The presented idea of an IT artifact was met with interest in the case organization. The creation of such an artifact was seen as beneficial in multiple ways. It could potentially increase knowledge of the problem itself, offer solutions to the problem, and increase understanding of the applicability of the said technology. Long-term commitment from both the researcher and the organization was secured with a contract.

The process of carrying out taxation is expected to be performed according to the law. Since taxation is mainly automated, questions have been raised about what can and should be carried out automatically. The deputy ombudsman pointed out in 2019 that automation in taxation is not based on precise legislation, and it should only be applied to tax assessment but not to decision-making. This current topic and policy set limitations for the artifact and its functionality.

The ethical principles for AI usage were followed as closely as possible in creating the artifact. The data used was monitored and analyzed, and corrections were made when necessary. The artifact did not function as an independent decision-maker, as a human monitored its performance.
Although legal and ethical perspectives were not the focus of this research, they provided an essential and holistic starting point for creating the artifact.

The problem is presented as a binary classification problem representing a class of tax administration problems. The class represents problems that require identification. By distinguishing the instances from one another and identifying underlying issues, the organization could pre-emptively decrease the occurrence of these problems.

Roles and responsibilities in the ADR team were set at the beginning of the project. The researcher was responsible for the research and for creating the IT artifact. An analytics expert and a claim for adjustment procedure representative offered valuable guidance and comments on the problem and the data for the project.

To create such an artifact, the researcher was handed intuitively chosen and anonymous data on LLCs from tax years 2017 to 2019. Intuitive data selection was performed by professionals working in the problem area who have developed a thorough understanding of the problem in its context. Participating in data selection were the researcher, the analytics expert, and the claim for adjustment procedure representative, who also formed the ADR team. Figure 9 depicts the different viewpoints of the team members in the project.

Figure 9. BIE viewpoints in the project.

The IT artifact was not intended as a decision-making instrument but as a knowledge creation instrument due to legal and ethical limitations on what an automated solution is allowed to do and what is expected from an AI solution within the problem context. The information provided by the artifact would only be used proactively. Within this problem context, the organization could remind customers who, according to the information provided by the artifact, are prone to not returning their tax returns. The reminder could take the form of, for example, a guidance text message or an instruction letter.
The IT-dominant BIE cycle was chosen as the focus of this ADR project. The researcher and the organization were interested in how well the artifact could perform as a knowledge-creating pre-emptive instrument. Restrictions stemming from law and ethics were considered in data retrieval and potential artifact usage contexts. The artifact’s beta version was not introduced to the end-users as it was not possible within the timeframe. The IT artifact was not meant to be handed over to the organization. The artifact was tested on the data received from the organization, and its performance was analyzed with the ADR team. Figure 10 presents the blueprint for the BIE plan in the project.

Figure 10. Original IT-dominant BIE plan in the tax administration ADR project

5.2. Building, intervention, evaluation

The ADR team analyzed the problem and decided which data to retrieve for the project. Data consisted of information presented in tax returns and basic information regarding the companies. Information such as names, owners, and company identification numbers was excluded. After being handed the data, the researcher began creating the artifact. The artifact aimed to distinguish the estimated companies from non-estimated ones as efficiently as possible. The initial knowledge creation target was to determine if such technologies could potentially be used for taxation benefits. The performance was monitored with precision, recall, f1-score, and a confusion matrix.

The IT artifact in this project was a piece of code analyzing data with DL libraries to deduce a correct outcome (supervised learning). The creation of the IT artifact started from scratch and was mainly based on trial and error with Keras. The characteristics of the NN were modified and tested in cycles. The data available was trimmed down to find the variables that had the most substantial impact on the result.
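The monitoring metrics named above can be computed directly from confusion-matrix counts. The following is a minimal sketch, not the code used in the project. It takes the true-positive and false-positive counts for the estimated class (905 and 569) from the worked precision example for the alpha version on the 2017 test set, and derives the remaining counts from the class sizes reported in table 5 (1645 estimated and 26822 non-estimated companies). The derived counts and the function name class_metrics are assumptions made for illustration.

```python
def class_metrics(tp, fp, fn):
    """Return (precision, recall, f1) for one class from its confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Alpha version, 2017 test set. TP and FP for class 1 come from the worked
# example in the text; FN and TN are DERIVED here from the class sizes in
# table 5 and are therefore an assumption, not values read from figure 11.
tp, fp = 905, 569
fn = 1645 - tp    # estimated companies the model missed
tn = 26822 - fp   # non-estimated companies classified correctly

estimated = class_metrics(tp, fp, fn)
# For class 0 the roles swap: its "true positives" are the TN of class 1.
non_estimated = class_metrics(tn, fn, fp)
# Macro average: unweighted mean of the two per-class values.
macro = tuple((a + b) / 2 for a, b in zip(non_estimated, estimated))

for name, values in [("0", non_estimated), ("1", estimated), ("macro", macro)]:
    print(name, [round(v, 2) for v in values])
```

Because the macro average weights both classes equally, it is pulled down by the weak results for the small estimated class, which an accuracy figure dominated by the non-estimated class would hide.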
Keras (2021a) is a DL application programming interface written in Python on the ML platform Tensorflow. The code was written with Spyder IDE (Spyder, 2021), a scientific Python development environment.

The strong theoretical background of machine learning and the ability of neural networks to enhance processes served as the starting point for this project. The first functional version of the IT artifact, the alpha version, and its results were analyzed with the project team. The results are presented in tables 5 to 7 and figures 11 to 13.

5.2.1. Alpha

Table 5 presents the alpha version’s performance on the tax year 2017 test set. Precision, recall, and f1-score are calculated for the non-estimated companies (0) and estimated companies (1). The number of companies in each group is presented under “Number of companies.” The macro average refers to the average of the NN’s performance in both classes. Tables 6 and 7 are structured the same way as table 5. They present the performance results attained on the 2018 and 2019 data with the artifact created with the 2017 data.

Table 5. Alpha, performance, 2017 test set
Estimation status   Precision   Recall   F1-score   Number of companies
0                   0.97        0.98     0.98       26822
1                   0.61        0.55     0.58       1645
macro average       0.79        0.76     0.78       28467

Figure 11 presents the CM for the alpha version’s performance on the 2017 test set. As an example, precision (TP / (TP + FP)) for class (estimation status) 1 is calculated: 905 / (905 + 569) = 0.61. Figures 12 and 13 present how the artifact that was created with the 2017 data performed with the 2018 and 2019 data.

Figure 11. Alpha, CM, 2017 test set

Table 6. Alpha, performance, 2018
Estimation status   Precision   Recall   F1-score   Number of companies
0                   0.98        0.98     0.98       93062
1                   0.58        0.57     0.58       4920
macro average       0.78        0.77     0.78       97982

Figure 12. Alpha, CM, 2018

Table 7. Alpha, performance, 2019
Estimation status   Precision   Recall   F1-score   Number of companies
0                   0.98        0.98     0.98       96388
1                   0.48        0.56     0.52       3677
macro average       0.73        0.77     0.75       100065

Figure 13. Alpha, CM, 2019

As the data was highly imbalanced, it proved to be an arduous task to get the neural network to recognize the estimated companies’ class (1) as well as possible. The precision, recall, and f1-score for the 0-class were excellent in the alpha version. However, the estimated class (1) results were not at an acceptable level. The neural network’s f1-score for the estimated class was 0.52-0.58, which was not on par with the 0-class’s f1-score of 0.98. Reaching at least a moderate f1-score was set as a target for the beta version.

The project team analyzed the performance of the alpha version and concluded that further refinements were required. New approaches and different testing scenarios were suggested. These worked as the starting point for the development of the beta version. The researcher continued the development according to the original BIE cycle.

5.2.2. Beta

The researcher selected three different approaches to be tested in creating the beta version. The alpha version worked as a starting point for the beta. The test plan for the beta version can be seen in table 8:

Table 8. Project team’s test plan for the beta

Test approach: Creation of new artificial variables from data
  Variable 1: Is the company passive and empty (no assets, liabilities, sales, purchases, or other taxable activity)? Yes (1) or no (0).
  Results: Using only variable 1, the neural network achieved results similar to the alpha version.
  Variable 2: Is the company passive but has 10 000 or more in assets? Yes (1) or no (0).
  Results: This variable did not affect the performance, and acceptable results were not achieved using only this variable or having it as an additional variable.

Test approach: Creating the artifact from the tax year 2018’s data and testing it on 2019 (leaving 2017 out altogether)
  Results: Leaving out the data from the tax year 2017 brought neither improvements nor significant drops in performance.

Test approach: Eliminating unnecessary variables from the data
  Results: By eliminating variables, a slightly better performance was reached. The remaining variables:
  • total taxable business income,
  • total tax-deductible business costs, and
  • net worth
  Results are shown in tables 9 to 11.

The results of the beta version were analyzed with the ADR team. Minor improvements to the model’s performance were achieved compared to alpha. Tables 9, 10, and 11 present the results in the same way as the results for the alpha version were previously presented.

Table 9. Beta, performance, 2017 test set
Estimation status   Precision   Recall   F1-score   Number of companies
0                   0.98        0.98     0.98       17909
1                   0.62        0.59     0.60       1069
macro average       0.80        0.78     0.79       18978

Table 10. Beta, performance, 2018
Estimation status   Precision   Recall   F1-score   Number of companies
0                   0.98        0.98     0.98       93062
1                   0.58        0.60     0.59       4920
macro average       0.78        0.79     0.78       97982

Table 11. Beta, performance, 2019
Estimation status   Precision   Recall   F1-score   Number of companies
0                   0.99        0.97     0.98       96388
1                   0.49        0.64     0.55       3677
macro average       0.74        0.81     0.77       100065

The improved performance in recognizing the estimated companies (class 1) can be seen in the increased f1-score from alpha to beta in all the tax years from 2017 to 2019. Table 12 presents how the f1-score increased during the artifact development from alpha to beta. F1-score %-increase refers to the relative increase in performance from alpha to beta.

Table 12. Artifact’s performance increased from alpha to beta
Version   Tax year   Precision   Recall   f1-score   f1-score %-increase
Alpha     2017       0.61        0.55     0.58       -
Beta      2017       0.62        0.59     0.60       3.5%
Alpha     2018       0.58        0.57     0.58       -
Beta      2018       0.58        0.60     0.59       1.7%
Alpha     2019       0.48        0.56     0.52       -
Beta      2019       0.49        0.64     0.55       5.77%

The most encouraging finding was that the model’s performance was maintained and slightly increased by eliminating variables.
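The f1-score %-increase column of table 12 is a relative change, (beta - alpha) / alpha. The short sketch below recomputes it from the rounded class 1 f1-scores in the table. The function name relative_increase is a hypothetical helper, and because the inputs are already rounded to two decimals, the results agree with the table only up to rounding.

```python
def relative_increase(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Class 1 f1-scores (alpha, beta) per tax year, as reported in table 12.
f1_scores = {2017: (0.58, 0.60), 2018: (0.58, 0.59), 2019: (0.52, 0.55)}

for year, (alpha, beta) in f1_scores.items():
    print(f"{year}: {relative_increase(alpha, beta):.2f} %")
```

An absolute gain of 0.01 to 0.03 in f1-score therefore corresponds to a relative gain of a few percent, which matches the description of the improvement as minor.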
Throughout the testing phases of the alpha and beta versions, the NN recognized the companies that were not estimated with an f1-score of 0.98. This was expected because the data was strongly imbalanced towards the non-estimated class. Moreover, as pointed out by the claim for adjustment procedure representative, it has always been a challenge in practice to distinguish the estimated companies from others.

During the testing phase of the beta version, it was discovered that a problem had occurred in the data retrieval process. As a result, fifty percent of the data had been missing. It was decided that no additional development cycles would take place, but instead, the finalized artifact would be tested with the corrected dataset, which included double the amount of data. However, the relative size of the estimated class did not change, and the dataset was still largely imbalanced between the classes.

Tables 13 to 15 and figures 14 to 16 present the results with the beta version of the artifact (no parameters changed) but utilizing the corrected dataset. Results are presented in the same way as for alpha and beta.

Table 13. Beta with additional data, performance, 2017 test set
Estimation status   Precision   Recall   F1-score   Number of companies
0                   0.97        0.98     0.98       37320
1                   0.64        0.57     0.60       2208
macro average       0.81        0.77     0.79       39528

Figure 14. Beta with additional data, CM, 2017 test set

Table 14. Beta with additional data, performance, 2018
Estimation status   Precision   Recall   F1-score   Number of companies
0                   0.98        0.98     0.98       193477
1                   0.59        0.59     0.59       10140
macro average       0.78        0.79     0.79       203617

Figure 15. Beta with additional data, CM, 2018

Table 15. Beta with additional data, performance, 2019
Estimation status   Precision   Recall   F1-score   Number of companies
0                   0.99        0.98     0.98       200284
1                   0.51        0.63     0.56       7580
macro average       0.75        0.80     0.77       207864

Figure 16. Beta with additional data, CM, 2019

Table 16. Artifact performance comparison
Version                 Tax year   Precision   Recall   f1-score   f1-score %-increase
Beta                    2017       0.62        0.59     0.60       -
Beta (corrected data)   2017       0.64        0.57     0.60       0%
Beta                    2018       0.58        0.60     0.59       -
Beta (corrected data)   2018       0.59        0.59     0.59       0%
Beta                    2019       0.49        0.64     0.55       -
Beta (corrected data)   2019       0.51        0.63     0.56       1.8%

No significant improvements were achieved even though the amount of data was doubled, as seen in table 16. As an additional point of view for analyzing the problem, the analytics expert suggested that analyzing the data with a self-organizing map (SOM) algorithm could provide valuable insight into the problem domain from UL’s perspective. It was decided that another development cycle (gamma) would be conducted by testing the performance of a SOM algorithm on the data. Therefore, changes to the original BIE plan were made. Changes and actualized BIE cycles are presented in figure 20.

5.2.3. Gamma

Based on the meeting regarding the beta version, an additional cycle was conducted that would utilize a UL method to tackle the issue of recognizing estimated companies. SOM was chosen as the UL method. The artifact was built with a SOM library (Minisom, 2022) created by Vettigli (2018).

In figures 17 to 19, five points representing the green and red squares and the white/green points were selected. Green squares represent points where there are no estimated tax years. A white point indicates that the data differs from the rest. Red circles represent companies whose tax year was estimated. A point including both red and green indicates that a clear distinction between the two was not attained. The income, expenses, and net worth values represent the median of the values in the specific points selected. Numbers on the X and Y axes represent the coordinates of each point. For example, in figure 17, the white point with a green circle has coordinates x=11 and y=3. The company-related information for the selected points, expressed as median values, is presented in table 17.
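The idea of mapping companies to SOM points and summarizing each point by its medians can be sketched as follows. This is an illustrative, plain-Python SOM written for this example; it is not the MiniSom-based artifact built in the project, and it does not reproduce MiniSom's API. The company values are synthetic and made up, and the 3x3 grid, the learning rate, the neighbourhood radius, and all function names are assumptions.

```python
import math
import random
from collections import defaultdict
from statistics import median


def winner(weights, sample):
    """Grid coordinates (x, y) of the node whose weights are closest to sample."""
    best, best_dist = None, float("inf")
    for x, column in enumerate(weights):
        for y, w in enumerate(column):
            dist = sum((a - b) ** 2 for a, b in zip(sample, w))
            if dist < best_dist:
                best, best_dist = (x, y), dist
    return best


def train_som(data, width, height, iterations=500, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM; weights[x][y] is a list of features."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(height)]
               for _ in range(width)]
    for t in range(iterations):
        frac = 1 - t / iterations                 # linear decay over time
        lr, sigma = lr0 * frac, max(sigma0 * frac, 0.01)
        sample = rng.choice(data)
        bx, by = winner(weights, sample)
        for x in range(width):
            for y in range(height):
                # Gaussian neighbourhood: nodes near the winner move the most.
                h = math.exp(-((x - bx) ** 2 + (y - by) ** 2) / (2 * sigma ** 2))
                w = weights[x][y]
                for i in range(dim):
                    w[i] += lr * h * (sample[i] - w[i])
    return weights


def node_medians(weights, scaled, raw):
    """Median of each raw variable for the companies mapped to each node."""
    groups = defaultdict(list)
    for s, r in zip(scaled, raw):
        groups[winner(weights, s)].append(r)
    return {node: tuple(median(col) for col in zip(*rows))
            for node, rows in groups.items()}


# Synthetic, made-up companies: (income, expenses, net worth) in euros.
raw = [
    (0, 0, 0), (0, 0, 500), (1_000, 800, 200),
    (50_000, 40_000, 30_000), (60_000, 45_000, 35_000),
    (2_000_000, 1_500_000, 900_000),
]
# Min-max scale each column to [0, 1] so no variable dominates the distance.
lows = [min(col) for col in zip(*raw)]
highs = [max(col) for col in zip(*raw)]
scaled = [[(v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs)]
          for row in raw]

som = train_som(scaled, width=3, height=3)
medians = node_medians(som, scaled, raw)
for node, values in sorted(medians.items()):
    print(node, values)
```

Each printed entry pairs a grid coordinate with the median income, expenses, and net worth of the companies mapped to it, which is the kind of per-point summary that tables 17 to 19 report. Colouring each point by the estimation status of its companies, as in figures 17 to 19, would require the labels as an additional input.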
Figure 17. SOM, 2017

Table 17. Datapoints, 2017

Based on the SOM map of the tax year 2017 (figure 17; table 17) and the points selected, it is evident that the SOM algorithm performs well at recognizing the non-estimated class. Companies with high income and expenses and significant net worth fall in the green area. The area containing white represents part of the data that differs from the rest. In this case, the 76 firms in the area are most likely the largest and most active companies in Finland regarding income, expenses, and net worth. The estimated class is not clearly distinguishable. However, the areas that include red are lower in income and expenses and smaller in net worth. The largest number of firms fall in point (14, 14), where the median income, expenses, and net worth are low.

Figure 18. SOM, 2018

Table 18. Datapoints, 2018

Figure 19. SOM, 2019

Table 19. Datapoints, 2019

The SOM map performs similarly with the tax years 2018 and 2019 data (figures 18 and 19). More active and wealthy companies fall into the non-estimated class, and smaller and non-active ones tend to fall into the red/green areas (tables 18 and 19). The white areas tend to represent the largest and most active companies. Areas including red are lower in income and expenses and smaller in net worth. The largest number of firms for the tax year 2018 fall in point (5, 0), where the median values of income, expenses, and net worth are low. For the tax year 2019, this is point (14, 0). A clear distinction as to whether a company with low income, expenses, and net worth falls into the estimated class is not attained.

The SOM algorithm performed similarly to the beta version. The non-estimated class was detected more efficiently than the estimated class. The SOM algorithm provided information that narrows down where the estimated companies are more likely to occur.

Figure 20. Actualized IT-dominant BIE cycles in the project

Due to time-related restraints, the BIE cycle was concluded with the gamma version (figure 20). Additional cycles could include either continuing the development of beta and gamma or creating an entirely new artifact. It is beneficial to develop and test more than one potentially suitable artifact. Results from beta and gamma indicate that an AI artifact has potential as a pre-emptive, knowledge-creating analysis instrument. Moreover, information created by two different algorithms can corroborate one another. End-user participation and testing in practice were absent from the project as they were not possible to conduct. They are left for future research and projects that would advance the artifacts whose development began in this project. This project’s AI-empowered analysis algorithms, such as the beta and gamma versions, are potential tools for decreasing irregular taxation-related behavior. (Sein et al., 2011, p. 44)

5.3. Reflection and learning

The artifact creation followed the principles laid out by ADR, considering project-related restrictions in time and scope. The researcher’s dual roles (the research itself and the artifact’s creation) slowed down the process. Having another person responsible for the actual artifact creation would have benefited the project. The project would have required more focus on setting up roles and responsibilities. However, the communication within the project team functioned well. The artifact’s creation followed the IT-dominant BIE, and the whole project was continuously evaluated (stages 1-3). As a result of the reassessment conducted after the beta version’s evaluation, a modification to the original BIE plan was added during the project.

The concept for the artifact stemmed from theory (theory-based artifact), and its creation was motivated by practice, thus creating a link between the two.
Concurrent and authentic analysis of the artifact’s performance was vital for the process. Everybody involved in the project could suggest and affect the direction of the artifact development.

The objective of the project was to create an IT artifact based on AI capable of identifying companies not returning their tax returns. After being handed the data, the researcher started creating the code in Python, following Keras guidelines. It was decided to analyze the model’s performance with a confusion matrix, precision, recall, and F1-score. The initial artifact’s creation ended with the beta version.

The development process from alpha to beta and gamma benefited from interventions, as insightful ideas to test out helped shape the form of the artifact. As a result of one intervention, the idea of creating an entirely different artifact for the same problem was initiated, resulting in the artifact’s gamma version. The change emerged from the research method’s guided emergence principle, which expects the project team to be sensitive toward expected and unexpected consequences and to act on refinement needs even if substantial changes would follow.

The gamma version approached the problem by analyzing the data with a SOM algorithm instead of the original vision. Gamma achieved similar results to beta by recognizing non-estimated companies and having issues drawing a distinct line between estimated and some non-estimated companies. The creation of gamma highlights the need to approach problems with more than one AI algorithm, allowing users to attain knowledge from several different algorithms that build on one another.

The project followed ADR principles according to the IT-dominant BIE with certain limitations. End-user testing was left out since the project was concluded with the gamma version. More development would be required to reach a finalized artifact that could be deployed to end-users.

5.4. Formalization of learning

The fourth stage in the ADR project required a change from specifics to generalization, divided into three levels according to the research method. The problem that ignited the project is that of recognizing defaults from historical data (a generalization of the problem instance).

The presented solution to the generalized problem is an AI-empowered instrument that is fed historical data and, based on that, attempts to classify the customers into defaulting and non-defaulting ones. The insight provided by the artifact is used to support decision-making. The knowledge-creating instrument offers a basis for pre-emptive actions as suggested in this project (a generalization of the solution instance). (Sein et al., 2011, p. 45)

The design principles were formed based on the answers to the research questions and how the artifact performed in practice. The research questions were answered as follows:

1. How can AI be deployed to the case organization to create value in its current taxation system?

Value is created in the form of time, speed, and liberated resources. AI-empowered instruments can analyze and form inferences significantly faster than a human could. Therefore, such a solution would free time for humans to concentrate on more urgent matters. It is left for humans to analyze the results provided by AI to see if they are applicable. The ADR project in this thesis showed that a NN and a SOM algorithm could detect estimated (defaulting) companies, even though room for improvement was left.

As the SOM analysis showed, AI is also better at detecting patterns and segmenting customers into corresponding sectors. Finland’s most valuable and active companies are likely not in the area of the SOM map containing estimated (defaulting) companies. Added value can be created by utilizing AI to help decision-making if boundaries on where and how to use the information are well examined.
Moreover, it is required that AI does not function as a decision-maker, and a human is always behind every tax-related decision.

The information provided by AI in the problem context either provides specific information on companies within a particular sector, as in the SOM analysis (descriptive), or information on whether the company will be estimated (imperative). To gain the most out of AI usage, it is suggested to approach a problem with more than one artifact. Doing this makes it possible to attain a more comprehensive view of the problem and its potential solution.

2. How can AI be deployed so that it does not violate rules and regulations?

Challenges in using AI in taxation can be divided into three levels:

1. Trustworthiness through accuracy

To be utilized as a knowledge creation instrument, adequate performance and accuracy are required and expected. Thorough testing, analysis, and continuous refinement are mandatory. If an organization-defined acceptable performance for an AI solution is not achieved, the solution should be discarded, and a new approach should be taken.

2. Legal and ethical restrictions and limitations of use

Legal and ethical perspectives should always be considered when developing an AI solution. According to the Finnish Tax Administration’s ethical principles for AI, AI should only use reliable data, follow laws and legislation, and constantly be monitored and managed by a human. Legal and ethical matters need to be addressed and considered when building AI solutions. They set restrictions, expectations, and requirements for an AI solution. A transparent and regulated AI solution is expected.

3. Justification of usage

A preliminary inspection of where to use AI solutions should be conducted to deduce if significant improvements are achievable. Restrictions and limitations of use should always be taken into account. Problems with a lot of data available might make a desirable use case for AI.
Organizations should prepare a few different AI approaches to tackle an issue that could potentially be solved entirely or partially with AI. Additionally, as the cost of using AI has decreased, organizations ought to have a low threshold for experimenting with them.

Lastly, seven meetings were held concerning the ADR project. Presentation and evaluation of alpha and beta versions formed the project’s core. Outcomes achieved in the project were shared with the organization, including the gamma version and its findings. Encouraging results pave the way for future projects and the development of similar solutions. Dissemination of results was left out as the ADR process was not finished, and a finalized product was not created.

The preliminary design principles derived from research outcomes achieved until the gamma iteration are presented in table 20.

Table 20. A preliminary set of design principles

Design principle: Trustworthiness through accuracy
Description: Trust is achieved only by producing accurate results. Organizations decide which is the acceptable level of accuracy.

Design principle: Legal and ethical restrictions and limitations of use
Description: Boundaries set by legal and ethical viewpoints should be an integral part of the development from the beginning and continuously evaluated to achieve sustainable and transparent use of AI.

Design principle: Justification of usage
Description: A preliminary investigation of the problem and possible AI solutions should be undertaken to determine if significant benefits are attainable.

A summary of the ADR process focusing on creating a pre-emptive artifact is shown in table 21. Table 21 is an adapted table based on Sein et al. (2011, p. 51).

Table 21. Summary of the ADR process

Summary of the ADR Process in the pre-emptive artifact for tax project
Columns: Stages and Principles; Artifact

Stage 1: Problem Formulation

Principle 1: Practice-Inspired Research
Practical challenges in the case organization and the willingness to explore novel solutions worked as a launching point.
Artifact: Recognition: Interest in utilizing AI solutions in taxation has grown. A transparent and holistic approach is vital. Simple problems with plenty of data are potential use cases such as recognizing estimated companies.

Principle 2: