Tax fraud directly harms market conditions by creating unfair competition. Compared to competitors who do not pay taxes, companies that operate in accordance with the law have higher costs and, therefore, higher prices for their products and services. Moreover, from the perspective of individual citizens, tax fraud curtails basic human rights (local infrastructure, health care, pensions…). Tax fraud reduces tax collection and thereby lowers the quality of public services and the room for improving them.
Detecting taxpayer fraud is the most difficult step in fiscal control. One of the primary goals of tax control is to monitor and check the financial operations of large companies/corporations, as they carry the biggest risk in the field of tax fraud. At the same time, small and micro companies that commit tax fraud do so mostly through cash transactions related to income and expenses. By using various independent sources, or by matching data and cross-checking it against other sources, tax authorities can detect taxpayers’ malicious actions.
Also, establishing taxpayers’ criminal actions requires a field tax inspection (random or on request), which demands a lot of time and financial resources from the tax administration (in terms of the number of inspectors and additional material resources). It is a very difficult, and ultimately impossible, task for the tax administration to rely solely on field inspections for the factual verification of tax compliance. In this respect, software tools would ease audit work and, in turn, help detect possible embezzlement without violating the relationship of trust between the taxpayers and the administration itself.
TaxCore’s® advantages
TaxCore®, as a taxpayer monitoring system, records all significant elements of every single fiscal transaction, which enables tax authorities to monitor tax collection and transaction records. The platform is based on taxpayers’ identity and data protection. It is very intuitive, allowing tax officials to search taxpayers according to several parameters. This enables tax officials to easily locate and use important information, monitor trends in the taxpayer’s business, etc. This software solution provides the tax administration with notifications about each commercial transaction in a way that enables risk analysis and remote verification.
One of TaxCore’s® advantages is the unification of the tax authority’s data for all taxpayers in a country. In this way, TaxCore® can form a large and important database, from which it can generate not just meaningful and significant results/reports, but also predictions about potential future tax frauds. The platform does not unify all taxpayers’ transaction data through centralization, in the sense of storing data in one place, but rather in the sense of providing complete information. Accumulated data enables comprehensiveness: when information is grouped, along with its importance, it can be applied more widely and more effectively in various analyses, which are in turn used for future predictions.
Tax inspection efficiency could be improved by applying a new approach in which the first step would be to define basic parameters through a data mining process. The objective is to develop algorithms that detect tax violations by using advanced methods of big data analysis and artificial intelligence with the help of machine learning. Big data is considered a new type of resource, in the sense of a business asset, and is used to improve business processes and increase productivity in subsequent periods.
Extracting Patterns and Models
The available literature estimates that, in addition to the financial sector, the area of the economy that will gain the most productivity from big data is the public sector, i.e., the state administration. The application of machine learning methods to large databases in the tax administration would provide valuable insights into the historical behavior of taxpayers. Based on this, we could obtain recommendations for field inspections. In addition to improving the effectiveness of field inspections, this approach would also enable the creation of risk categories for taxpayers. The idea is to create patterns based on historical indicators of certain taxpayer attributes; depending on the degree of matching with these patterns, the platform could assign a taxpayer a certain level of risk.
Extracting patterns of possible tax fraud scenarios would be based on the taxpayers’ historical behavior and would work by monitoring certain (predefined) attributes and relying on some of the models from the literature (artificial neural networks, Bayesian networks, logistic regression…). It is necessary to categorize the risk of the taxpayer into, for instance, low/medium/high. Depending on the degree of coincidence (probability) with the defined tax fraud scenarios, the taxpayer would be assigned a category of importance based on the level of risk. One of the benefits of developing risk indicators for individual taxpayers is the possibility of using these indicators to rank all taxpayers according to a defined risk level.
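The categorization and ranking described above can be sketched in a few lines of Python. This is only an illustration: the cut-off values, the `ScoredTaxpayer` type, and the taxpayer identifiers are assumptions made for the example and are not part of TaxCore®; in practice the probability would come from one of the models mentioned above and the thresholds from tax experts.

```python
from typing import NamedTuple

# Hypothetical cut-offs; real values would be set by tax experts.
MEDIUM_THRESHOLD = 0.4
HIGH_THRESHOLD = 0.7


class ScoredTaxpayer(NamedTuple):
    taxpayer_id: str
    fraud_probability: float  # degree of match with fraud scenarios, 0..1


def risk_category(probability: float) -> str:
    """Map a match probability to a low/medium/high risk category."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= HIGH_THRESHOLD:
        return "high"
    if probability >= MEDIUM_THRESHOLD:
        return "medium"
    return "low"


def rank_by_risk(taxpayers):
    """Order taxpayers from the highest to the lowest match probability."""
    return sorted(taxpayers, key=lambda t: t.fraud_probability, reverse=True)


# Made-up scores for illustration only.
scored = [
    ScoredTaxpayer("TP-001", 0.15),
    ScoredTaxpayer("TP-002", 0.82),
    ScoredTaxpayer("TP-003", 0.55),
]
ranking = [(t.taxpayer_id, risk_category(t.fraud_probability))
           for t in rank_by_risk(scored)]
```

With these made-up scores, `ranking` lists TP-002 (high) first, then TP-003 (medium), then TP-001 (low), which is the order in which inspectors would review them.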
Machine learning
The platform could use machine learning to discover patterns and relationships among attributes that are useful for identifying taxpayers’ “suspicious” behavior. It would be used to select suspicious taxpayers, who would then be forwarded to inspectors for further checks. The goal of this approach is to increase the productivity of tax inspectors in the field and to recover lost tax revenue. Compared to manual searching, this data mining technique is a more modern (scientific) approach that would save resources and avoid personal judgments in selecting “suspicious” taxpayers.
How to separate anomalies
The basic starting point for determining future potential tax frauds is to distinguish intentional fraud from an accidental mistake made by the taxpayer. The term “fraud” refers to all cases of willful non-compliance with tax regulations. Tax fraud is often treated as a synonym for tax violation. However, tax violation includes all cases in which taxpayers have not settled their tax liability, whereas tax fraud represents the taxpayer’s intention to circumvent the law to avoid paying taxes. Thus, tax frauds are a subset of the set of tax violations. The delineation would be based on all realistic scenarios of tax violations derived from historical data, and the main indicator for distinguishing intent from mere coincidence would be the frequency of violation cases.
We expect many analytical anomalies when starting data mining. High-risk anomalies should be distinguished from low-risk anomalies. Those anomalies that could be foreseen should not be taken into further consideration. Knowledge of the tax system, precisely through TaxCore®, will make it possible to separate normal and expected anomalies from those characterized as potential tax frauds.
Classifying the data
The first step is to classify the data we have in terms of its correctness (negative amounts, empty fields, formats, duplicate values, uneven values…). The accuracy of the model depends on the correctness of the input data, which in turn largely depends on how accurately it was entered. This brings us back to the very beginning: the correctness of the input information is the responsibility of the taxpayer, who uses the components for issuing invoices. The challenge is to define the initial conditions and attributes based on which we can create scenarios of taxpayers’ risky behavior.
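A minimal sketch of such correctness checks is shown below. The record layout (`invoice_id`, `taxpayer_id`, `amount`, `issued_at`) is a hypothetical schema chosen for the example, not the actual TaxCore® data model, and only three of the checks listed above are covered (empty fields, negative amounts, duplicate values).

```python
from collections import Counter

# Hypothetical invoice schema used only for this illustration.
REQUIRED_FIELDS = ("invoice_id", "taxpayer_id", "amount", "issued_at")


def find_data_issues(invoices):
    """Return (record index, issue description) pairs for suspect records."""
    issues = []
    id_counts = Counter(inv.get("invoice_id") for inv in invoices)
    for index, inv in enumerate(invoices):
        # Empty or missing mandatory fields.
        for field in REQUIRED_FIELDS:
            if inv.get(field) in (None, ""):
                issues.append((index, f"empty field: {field}"))
        # Amount must be numeric and non-negative.
        amount = inv.get("amount")
        if isinstance(amount, (int, float)):
            if amount < 0:
                issues.append((index, "negative amount"))
        elif amount not in (None, ""):
            issues.append((index, "amount is not numeric"))
        # The same invoice identifier appearing more than once.
        invoice_id = inv.get("invoice_id")
        if invoice_id not in (None, "") and id_counts[invoice_id] > 1:
            issues.append((index, "duplicate invoice_id"))
    return issues


# Made-up records for illustration only.
sample = [
    {"invoice_id": "A1", "taxpayer_id": "TP-001", "amount": 120.0,
     "issued_at": "2023-05-01T10:00"},
    {"invoice_id": "A2", "taxpayer_id": "TP-001", "amount": -5.0,
     "issued_at": "2023-05-01T10:05"},
    {"invoice_id": "A1", "taxpayer_id": "TP-001", "amount": 80.0,
     "issued_at": ""},
]
issues = find_data_issues(sample)
```

Records that fail such checks would be set aside or corrected before any model is trained, since the accuracy of the model depends on them.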
Malicious models of behavior should be defined; that is, rules should be formed that correspond to known or hypothetical frauds. If there is no feedback from experts, we need to create synthetic data ourselves that represents both legal and illegal transactions. By interpreting the existing data, its interconnections and intersections, and by applying a model to that data, we should reach the goal: determining, with a certain accuracy, whether a taxpayer’s transaction can be interpreted as fraud.
Moreover, it is good to apply different machine learning methods (K-nearest neighbors, decision tree classifier, artificial neural networks, logistic regression…) to determine how these methods handle the input data and how accurate their results are. If we forward the obtained results to the tax administration, it could verify them through field inspections, thus providing feedback on the (in)accuracy of the method. This could be considered the only valid confirmation of the method itself. In this regard, the method of choice would be the one that yields the most accurate results.
Supervised methods to detect fraud
If real historical data on proven tax frauds exists, taxpayers who might commit fraud in the future would be identified by using supervised methods. The model would search the transactions of all taxpayers in the database and identify those taxpayers whose characteristics (behavior) are similar to those of taxpayers proven to have committed tax fraud. If there is no knowledge or available information about existing tax frauds, we would perform data mining using unsupervised machine learning methods, although these offer lower precision and interpretability than supervised methods.
With the unsupervised method, in contrast to the supervised method, we would identify not just cases of tax fraud, but also economic entities that are irregular in paying their tax obligations, as well as taxpayers’ suspicious behavior. These methods can be used in the verification work carried out by tax inspectors to establish tax crime. They are also suitable for supporting risk management decisions related to tax fraud, helping to better prioritize tax controls and ensure more efficient tax collection.
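As an illustration of the supervised case, the sketch below uses a bare-bones K-nearest neighbors classifier, one of the methods named earlier, to label a new taxpayer according to the most similar taxpayers with a known outcome. The two features (share of refunds, share of cash payments) and all the numbers are invented for the example; a real model would use attributes and labels agreed with the tax administration.

```python
import math
from collections import Counter


def knn_predict(labelled, features, k=3):
    """Label `features` by majority vote among its k nearest labelled examples.

    labelled: list of (feature_vector, label) pairs with known outcomes.
    """
    neighbours = sorted(
        labelled, key=lambda item: math.dist(item[0], features)
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical history: (refund share, cash payment share) -> proven outcome.
history = [
    ((0.30, 0.90), "fraud"),
    ((0.25, 0.85), "fraud"),
    ((0.35, 0.95), "fraud"),
    ((0.02, 0.40), "compliant"),
    ((0.05, 0.50), "compliant"),
    ((0.03, 0.45), "compliant"),
]

suspicious = knn_predict(history, (0.28, 0.88))
ordinary = knn_predict(history, (0.04, 0.42))
```

Here the first taxpayer resembles the proven fraud cases and is labelled "fraud", while the second is labelled "compliant". The label is only a recommendation for a field inspection, not proof of fraud.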
Examples from practice
The following is a brief overview of how the TaxCore® solution could be applied, in line with current machine learning trends, to predict which taxpayers might commit tax fraud in the future. Refund invoices are a special invoice category that deserves attention. According to global figures for retail businesses, employees commit as much as 28% of all fraud by issuing refunds. For this reason, when looking for possible tax fraud, we should place emphasis on refunds. Companies should pay special attention to employees who possess additional credentials (e.g., managers) because they have privileges for additional discounts, coupons for subsequent purchases, and the like. Some of the possible ways to track refunds are:
- Scenario 1: Monitoring the number of refunds as a share of the taxpayer’s total number of issued invoices, compared across all employees. The frequency of occurrence should be monitored on a daily, weekly, or other periodic basis. A large share of refunds in the taxpayer’s total sales, if often repeated, is an alarm signal.
- Scenario 2: Monitoring the number of canceled items per invoice for each taxpayer and comparing it across employees. A large number of cancellations by one taxpayer in relation to sales, if often repeated, is an alarm signal.
- Scenario 3: Monitoring the price oscillation of an individual item where taxpayers increase the price or decrease it compared to the average, which may indicate manipulation of the reported and actual selling price, as well as an illegal increase (price gouging).
- Scenario 4: Monitoring customer reports of suspicious transactions related to a specific point of sale.
The essence of these scenarios is the repetition or frequency of the events. We cannot mark an event that occurs only once, or an insignificant number of times, in the observed period as potential tax fraud.
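Scenario 1, combined with the frequency condition, could be sketched as follows. The 20% refund share and the minimum of three alert days are hypothetical thresholds chosen for the example, and the input is assumed to be a simple list of (employee, day, is refund) records rather than real TaxCore® data.

```python
from collections import defaultdict

# Hypothetical thresholds; experts would define the real ones.
REFUND_SHARE_ALERT = 0.20  # refunds as a share of invoices issued that day
MIN_ALERT_DAYS = 3         # repetition needed before raising an alarm


def refund_alert_days(invoices):
    """Count, per employee, the days on which the refund share was too high.

    invoices: iterable of (employee_id, day, is_refund) records.
    """
    totals = defaultdict(int)
    refunds = defaultdict(int)
    for employee, day, is_refund in invoices:
        totals[(employee, day)] += 1
        if is_refund:
            refunds[(employee, day)] += 1
    alert_days = defaultdict(int)
    for (employee, day), total in totals.items():
        if refunds[(employee, day)] / total > REFUND_SHARE_ALERT:
            alert_days[employee] += 1
    return dict(alert_days)


def flagged_employees(invoices):
    """Employees whose high refund share repeats often enough to matter."""
    counts = refund_alert_days(invoices)
    return sorted(e for e, days in counts.items() if days >= MIN_ALERT_DAYS)


# Made-up data: E1 refunds 4 of 10 invoices on three days,
# E2 refunds 1 of 10 once, E3 refunds 3 of 5 on a single day.
sample = []
for day in ("Mon", "Tue", "Wed"):
    sample += [("E1", day, i < 4) for i in range(10)]
    sample += [("E2", day, day == "Mon" and i == 0) for i in range(10)]
sample += [("E3", "Mon", i < 3) for i in range(5)]

flagged = flagged_employees(sample)
```

Only E1 is flagged: E3 also exceeds the refund share, but on a single day, which by the reasoning above is not enough to mark the behavior as potential fraud.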
Risk categories
Also, it is necessary to define several mutually independent risk categories related to different concepts (weight factors of initial conditions, frequency, taxpayers’ risk categories…). Risk assessment is a very subjective process; however, if we apply certain methodologies and principles, we can reduce subjectivity to the lowest possible level. Based on the available data in the TaxCore® database and on expert knowledge, we can identify and define certain rules and risk types, as well as risk acceptability levels. Risk assessment would entail making decisions based on real data and the experience of experts. Every risk event is accompanied by its frequency. In this respect, it becomes necessary, based on the experts’ knowledge, to define frequency intervals according to which we could delimit the taxpayers’ behavior. The essence of the problem lies in setting thresholds, both for the risk levels of the observed event and for the frequency intervals.
TaxCore® makes it possible to track and separate the time of invoice issuance from the time of invoice reception in the database. This information is very important because, by monitoring the time of invoice reception, we can determine whether invoices accumulate in a particular part of the day (e.g., end of working hours), whether there are gaps during working hours, how often these patterns repeat, etc.
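A minimal sketch of this kind of check is given below. It assumes that reception times are available as Python `datetime` values and that working hours run from 09:00 to 17:00; both the working hours and the sample timestamps are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime

# Hypothetical working hours: 09:00 to 17:00 (last working hour is 16:xx).
OPENING_HOUR = 9
CLOSING_HOUR = 17


def reception_profile(received_times):
    """Return (working hours with no receptions, share received in last hour)."""
    per_hour = Counter(t.hour for t in received_times)
    gap_hours = [h for h in range(OPENING_HOUR, CLOSING_HOUR)
                 if per_hour[h] == 0]
    total = sum(per_hour.values())
    last_hour_share = per_hour[CLOSING_HOUR - 1] / total if total else 0.0
    return gap_hours, last_hour_share


def reception_delay_minutes(issued_at, received_at):
    """Minutes between issuing an invoice and its reception in the database."""
    return (received_at - issued_at).total_seconds() / 60


# Made-up reception times for one working day.
day = (2023, 5, 2)
received = [datetime(*day, h, m)
            for h, m in [(9, 15), (10, 30), (16, 5), (16, 20), (16, 40), (16, 55)]]

gap_hours, last_hour_share = reception_profile(received)
delay = reception_delay_minutes(datetime(*day, 16, 0), datetime(*day, 16, 5))
```

In this made-up day, no invoices are received between 11:00 and 15:59 and two thirds of them arrive in the last working hour, which is the kind of accumulation the paragraph above describes. Whether it is suspicious depends on how often it repeats.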
We can also check tax rates through the TaxCore® system, in terms of whether taxpayers really apply the tax rates they have declared. Also, we can determine whether, at the level of one taxpayer, several different tax rates are applied to the same item, as well as whether tax rates are mixed up on invoices issued by a taxpayer who sells items in several tax categories.
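Both checks can be expressed compactly, as in the sketch below. It assumes invoice lines are available as (taxpayer, item, tax rate) tuples with rates given as whole percentages, and that each taxpayer’s declared rates are known; these structures and the sample values are illustrative assumptions only.

```python
from collections import defaultdict


def items_with_mixed_rates(invoice_lines):
    """Find items that one taxpayer has sold under more than one tax rate.

    invoice_lines: iterable of (taxpayer_id, item_code, tax_rate_percent).
    """
    rates = defaultdict(set)
    for taxpayer_id, item_code, tax_rate in invoice_lines:
        rates[(taxpayer_id, item_code)].add(tax_rate)
    return {key: sorted(found) for key, found in rates.items()
            if len(found) > 1}


def undeclared_rates(invoice_lines, declared):
    """Find (taxpayer, rate) pairs used on invoices but never declared.

    declared: mapping of taxpayer_id to the set of rates they declared.
    """
    return sorted({(taxpayer_id, rate)
                   for taxpayer_id, _, rate in invoice_lines
                   if rate not in declared.get(taxpayer_id, set())})


# Made-up invoice lines and declarations for illustration only.
lines = [
    ("TP-001", "coffee", 20),
    ("TP-001", "coffee", 10),
    ("TP-001", "bread", 10),
    ("TP-002", "coffee", 20),
    ("TP-002", "tea", 5),
]
declared = {"TP-001": {10, 20}, "TP-002": {20}}

mixed = items_with_mixed_rates(lines)
undeclared = undeclared_rates(lines, declared)
```

In this example TP-001 sells coffee under two different rates, and TP-002 uses a 5% rate it never declared; both findings would be candidates for further checks.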
Element trends
In addition to this, we can monitor trends of the following elements at a taxpayer level for arbitrary time intervals (daily, weekly, monthly, quarterly, at an annual level, tax calendar, beginning and end of the fiscal period or any period for submitting fiscal returns, since the period immediately before is very interesting for withdrawing money and reducing turnover):
- Number of issued invoices
- Turnover
- Tax amount
- Max and min number of issued invoices
- Cash/card ratio for types of payment
- Types of transactions and their percentages by number and amount in the total number/amount of transactions (double monitoring at the level of taxpayers and taxpayer comparison with the average trend at the level of business activity), etc.
These data can be important for monitoring the degree of deviation of the trend at one taxpayer’s level in relation to the trends at the level of their business activity, according to the selected parameter.
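As a rough illustration, some of the listed elements could be aggregated per month as shown below. The invoice fields (`date`, `amount`, `tax`, `payment`) are hypothetical names chosen for the example, and the sketch reports the share of cash invoices rather than a cash/card ratio so that months without card payments do not cause a division by zero.

```python
from datetime import date


def monthly_summary(invoices):
    """Aggregate invoice count, turnover, tax and cash share per month.

    invoices: iterable of dicts with hypothetical keys
    'date' (datetime.date), 'amount', 'tax' and 'payment' ('cash' or 'card').
    """
    summary = {}
    for inv in invoices:
        key = (inv["date"].year, inv["date"].month)
        row = summary.setdefault(
            key,
            {"invoices": 0, "turnover": 0.0, "tax": 0.0, "cash_invoices": 0},
        )
        row["invoices"] += 1
        row["turnover"] += inv["amount"]
        row["tax"] += inv["tax"]
        if inv["payment"] == "cash":
            row["cash_invoices"] += 1
    for row in summary.values():
        row["cash_share"] = row["cash_invoices"] / row["invoices"]
    return summary


# Made-up invoices for illustration only.
sample = [
    {"date": date(2023, 1, 10), "amount": 100.0, "tax": 20.0, "payment": "cash"},
    {"date": date(2023, 1, 24), "amount": 50.0, "tax": 10.0, "payment": "card"},
    {"date": date(2023, 2, 3), "amount": 200.0, "tax": 40.0, "payment": "cash"},
]
summary = monthly_summary(sample)
```

The same aggregation computed for all taxpayers in one business activity would give the average trend against which an individual taxpayer is compared.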
Example 1
One of the examples of trend monitoring according to one parameter would be to determine the deviation of the trend of the number of taxpayer invoices in relation to the trend of the average number of issued invoices at the level of their business activity. We could also define deviation intervals: (1) what percentage of deviations we tolerate, (2) what thresholds would require additional investigation without declaring the taxpayer as a possible fraud perpetrator and (3) what percentage of deviations, and anything higher than that, would mark the taxpayer as a potential fraud perpetrator and would alert the tax inspectors to carry out a field inspection of that taxpayer.
Along with the deviation percentage, it is necessary to monitor its frequency, and the weight (significance) assigned to the frequency of deviations must differ between the weekly and the monthly level. For example, two deviations within a week do not have the same weight as two deviations within a month.
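The three deviation intervals and the frequency weighting could look roughly like the sketch below. The 10% and 30% thresholds are hypothetical, and expressing frequency as deviations per day is just one possible way to give weekly deviations more weight than monthly ones; neither is prescribed by TaxCore®.

```python
# Hypothetical thresholds; experts would define the real ones.
TOLERATED_DEVIATION = 0.10  # up to 10% deviation is tolerated
ALERT_DEVIATION = 0.30      # 30% or more marks a potential fraud perpetrator

PERIOD_DAYS = {"week": 7, "month": 30}


def deviation_zone(taxpayer_count, activity_average):
    """Classify a taxpayer's deviation from the business activity average."""
    if activity_average <= 0:
        raise ValueError("activity_average must be positive")
    deviation = abs(taxpayer_count - activity_average) / activity_average
    if deviation <= TOLERATED_DEVIATION:
        return "tolerated"
    if deviation < ALERT_DEVIATION:
        return "investigate"
    return "potential fraud"


def deviation_rate(deviation_count, period):
    """Deviations per day, so the same count weighs more in a shorter period."""
    return deviation_count / PERIOD_DAYS[period]


# Made-up invoice counts against an activity average of 100 invoices.
zones = [deviation_zone(count, 100) for count in (95, 80, 50)]
weekly_weight = deviation_rate(2, "week")
monthly_weight = deviation_rate(2, "month")
```

With these assumptions, counts of 95, 80 and 50 invoices fall into the tolerated, investigate and potential fraud zones respectively, and two deviations in a week carry more than four times the weight of two deviations in a month.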
Example 2
Based on several examples from practice and their recorded turnover in TaxCore®, we observed a decreasing trend in the number of issued invoices from month to month. Also, on certain working days taxpayers did not issue a single invoice (there are no recorded invoices in the system). It should be noted that these examples come from the hospitality industry, and that in some industries it is common not to have sales every day. Based on these examples, we can define a scenario in which a taxpayer enters the red zone when both conditions are met: the number of invoices tends to decrease, and on some days the taxpayer records no sales. Entering the red zone means that we pass the task on to the tax inspectors for additional controls.
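The rule from this example can be written down directly, as in the sketch below. It assumes that a "decreasing trend" means every month has fewer invoices than the previous one, which is a deliberately strict simplification; a real implementation might fit a trend line or tolerate small fluctuations.

```python
def has_decreasing_trend(monthly_counts):
    """True if every month has fewer invoices than the month before it."""
    return len(monthly_counts) >= 2 and all(
        later < earlier
        for earlier, later in zip(monthly_counts, monthly_counts[1:])
    )


def has_zero_invoice_days(working_day_counts):
    """True if no invoice was issued on at least one working day."""
    return any(count == 0 for count in working_day_counts)


def in_red_zone(monthly_counts, working_day_counts):
    """Both conditions must hold before the case goes to the inspectors."""
    return (has_decreasing_trend(monthly_counts)
            and has_zero_invoice_days(working_day_counts))


# Made-up counts for illustration only.
falling_with_gaps = in_red_zone([900, 740, 610], [31, 0, 27, 25])
falling_without_gaps = in_red_zone([900, 740, 610], [31, 12, 27, 25])
growing_with_gaps = in_red_zone([600, 650, 700], [10, 0, 5, 8])
```

Only the first case enters the red zone; a falling trend alone, or isolated days without sales alone, is not enough under this rule.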
Every type of fraud leaves certain “traces” in the data. TaxCore® records events in real time, leaving no possibility for subsequent, retroactive data changes that would “override” old data. This further implies that every change to every piece of information in the database is stored. In this way, we create big data whose search and analysis can reveal the models of potential future misdemeanors. Humans are not able to manually analyze large databases or to define and extract patterns and scenarios from the data. Advanced machine learning methods are ideal for mining large databases and for identifying scenarios. The patterns that machine learning methods would search for in the data can be defined based on real historical indicators belonging to taxpayers who have committed tax fraud, or by constructing fictitious tax fraud scenarios. Artificial intelligence methods can be used in both directions.
Artificial Intelligence methods
The first direction is the use of AI methods to identify and single out taxpayers who have committed tax violations; the second is their use as tools for committing fraud. Namely, fake data sets can be generated for particular tax categories by AI algorithms trained to achieve the desired level of matching accuracy with real data. Such a data set, practically indistinguishable from real data, could be submitted as tax information in order to mislead tax inspectors. In this respect, it is necessary to include advanced technologies in the taxpayer monitoring process to prevent, or at least control, the growing use of AI in tax fraud.
Given that, under the current conditions of globalization and information technology development, the number of risks is increasing drastically, the country must define measures and approaches for detecting potential tax fraud in order to protect its financial stability. In this regard, it becomes necessary to adapt to global financial turmoil and to consider a modern approach to identifying tax fraud. TaxCore®, thanks to its comprehensive application, innovative approach, technology, and theory, would enable tax authorities to manage future tax fraud risks, which would result in further economic progress and growth of the country.
Text Author: Jelena Lukić, business analyst, Data Tech International, d.o.o.