SKAT drives smarter tax collection with predictive modelling

Assessing the behaviour of individual debtors and adjusting collection methods accordingly

Published on 14-Mar-2012

Validated on 16 Dec 2013

"We’re very pleased with the accuracy of the model – a prediction index of 70 percent is comparable with the results achieved by most banks." - Mads Krogh Nielsen, Project Manager for Data Mining and Analytics, SKAT



BA - Business Analytics, BA - Business Intelligence, Optimizing IT, Smarter Planet

Smarter Planet:
Smarter Government


SKAT is the Danish government body responsible for collecting taxes and duties. Its mission is to ensure a fair and effective financing of the public sector in Denmark. It employs approximately 7,500 people, of whom 1,500 are occupied with collection, and is responsible for more than 250 different types of public debts – from income tax and customs duties down to train tickets and TV licences.

Business need:
SKAT, the Danish tax and customs agency, saw an opportunity to significantly simplify the collection of more than €8.5 billion in arrears from over 600,000 citizens and businesses (2009 figures). If it could find a way to segment the debtors into groups based on their ability and willingness to pay, it could tailor its collection processes for each group and significantly reduce workload for the collection team.

SKAT is building a solution that uses IBM Business Analytics technologies to assess every debtor against a statistical model that predicts their probable payment behaviour. Based on the scores generated by the model, the debtors are sorted into different collection processing channels. The initiative has also led to a number of useful side-projects, resulting in the creation of predictive models for late VAT payments and bankruptcies.

When the whole project is complete, it should deliver the equivalent of an 18 percent reduction in workload for the collections department, enabling approximately 134 employees to be redeployed to more valuable roles.

The solution provides new insight into how people and businesses settle their debts, helping SKAT design more effective tax collection processes that are tailored to different types of debtors. It automates the sorting and prioritisation of debtors, and enables fully automated collections for the majority of debtors, which will eliminate many hours of manual work for SKAT’s 1,500 tax collectors, half of whom deal with unpaid arrears. It also simplifies the VAT process by predicting late payers and sending them automatic reminders.

Case Study

Smarter Government: Matching collection methods to debtor profiles

Instrumented: Arrears data from state and municipal tax assessments, train ticketing systems, television licences, police records and numerous other sources is collected into a central database for analysis.

Interconnected: The data is fed into a range of statistical models which score each debtor and segment them into groups. Each group then passes through a collection process designed for its specific circumstances.

Intelligent: SKAT gains a significantly more accurate idea of which debtors are likely to pay quickly, and which will require a more proactive approach. As a result, many of the easy collections can be handled by a low-cost, fully automated process, giving the organisation more resources to focus on the worst debtors.

“Currently, we are targeting the collection of approximately €8.5 billion in arrears from more than 600,000 debtors, including individual citizens, sole traders and businesses,” comments Mads Krogh Nielsen, Project Manager for Data Mining and Analytics at SKAT. “It is a huge and very complicated job, and collection is mostly a manual process. Our 1,500 tax collectors – half of whom deal with unpaid arrears – have to review and prioritise each case and decide how best to persuade the debtor to pay. There are some business rules to help with segmentation, but it still takes a long time to determine the best course of action, let alone to actually collect the debt.”

Modernising the collections process
Reports and budgetary analysis from the National Audit department suggested that SKAT needed to modernise its IT landscape to streamline the collections process, so the organisation began looking for IT solutions that could help to tackle the problem.

“The assessment of taxes in Denmark has been a streamlined, online process for several years, so we began to question why the collections side of the process was still so manual,” says Mads Krogh Nielsen. “We realised that it was a major problem to sort and prioritise the debtors in the first place. If we could find an automated way to identify which debtors were likely to pay their bills quickly, and which would require a more proactive approach, we would be able to tailor the rest of the process accordingly.

“In most cases, a few letters, emails or text messages will probably be enough to remind a debtor to pay; this can be handled completely automatically. In the harder cases, we might need to visit them personally or even use surveillance, which will still require skilled human resources. But by using the right methods on the right people, we knew we would be able to substantially increase the proportion of automated collections and reduce the time and effort needed to settle the arrears.”

Predicting the behaviour of debtors
The challenge, then, was to find a way to automatically predict the likely behaviour of a given debtor, and then assign them to a specific ‘track’ that would handle the collection process in the most appropriate way. This was where IBM Business Analytics came in.
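The idea of routing each debtor to a ‘track’ based on a model score can be illustrated with a minimal sketch. The cut-off values, the track names and the debtor IDs below are hypothetical; the case study does not publish SKAT’s actual thresholds or channels.

```python
from collections import Counter

# Hypothetical cut-offs and track names; the case study does not publish them.
AUTOMATED_MAX_RISK = 0.3   # below this, reminders alone are assumed to suffice
MANUAL_MIN_RISK = 0.7      # at or above this, a collector handles the case


def assign_track(risk_score: float) -> str:
    """Map a predicted non-payment risk in [0, 1] to a collection track."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if risk_score < AUTOMATED_MAX_RISK:
        return "automated_reminders"
    if risk_score < MANUAL_MIN_RISK:
        return "phone_follow_up"
    return "collector_case"


# Illustrative debtor IDs and scores, not real data.
scores = {"D001": 0.05, "D002": 0.45, "D003": 0.92, "D004": 0.10}
tracks = {debtor: assign_track(score) for debtor, score in scores.items()}
track_counts = Counter(tracks.values())
print(track_counts)
```

In a sketch like this, the share of debtors landing in the automated track depends entirely on where the cut-offs are placed, which is presumably the kind of tuning a pilot would inform.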

“We asked three technology vendors for their recommendations, and two of them suggested IBM SPSS Modeler,” explains Mads Krogh Nielsen.

“We were impressed with the potential of the software itself and its ability to integrate easily with our existing business intelligence systems, which were SAP BusinessObjects and Sybase. We decided to go ahead with the project and set up our own team of six people who would work with the external consultants to create the statistical models that would help us to predict debtor behaviour. The dedication of internal resources was important because we wanted to learn as much as we could about the SPSS solution so that we could manage and develop it ourselves in the future.”

Building predictive models
The project team created three sub-models: one for individuals, one for sole traders, and one for businesses. The first stage was to gather all the information that might be relevant to predicting debtor behaviour. For the sub-model for individuals, this produced 2,700 variables, including everything from the size of the debtor’s family to the make and model of car they drive. Next, the team tried to discover which of these variables were most significant. They used univariate analysis to reduce the number of variables from 2,700 to 400, and then employed special techniques in the SPSS software to get from 400 to 150. A group of business analysts and statistics experts then selected about 70 of these variables, and finally the team used multivariate analyses to select the final 15 that were used in the first version of the model.
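The first reduction step, univariate analysis, can be sketched roughly as follows. This is not SKAT’s procedure or an SPSS Modeler API: it assumes, for illustration only, that each candidate variable is ranked by the absolute value of its Pearson correlation with the payment outcome, and it runs on synthetic data with made-up variable names.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def univariate_screen(features, outcome, keep):
    """Rank variables by |correlation| with the outcome and keep the strongest."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], outcome)),
        reverse=True,
    )
    return ranked[:keep]


# Synthetic data: 2 informative variables hidden among 20 noise variables.
rng = random.Random(0)
n_debtors = 500
outcome = [rng.randint(0, 1) for _ in range(n_debtors)]  # 1 = bad payer
features = {
    f"noise_{i}": [rng.gauss(0, 1) for _ in range(n_debtors)] for i in range(20)
}
features["unpaid_train_tickets"] = [y * 2 + rng.gauss(0, 1) for y in outcome]
features["assets_owned"] = [-y * 2 + rng.gauss(0, 1) for y in outcome]

selected = univariate_screen(features, outcome, keep=5)
print(selected)
```

A univariate screen looks at one variable at a time, so it cannot detect redundancy between variables. That is one plausible reason a later multivariate step was still needed to arrive at the final 15.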

Gaining new insights
“For the first time, we had a scientific basis for deciding which factors should affect the way we handle collections,” says Mads Krogh Nielsen.

“Many of our findings confirmed what we already suspected – for example, that people who own a lot of assets or have large mortgages tend to be good payers, while people who have small amounts of debt tend to be worse payers. But some of our findings were quite surprising: for example, we found that owning a large number of cars significantly reduces the risk of being a bad debtor. This is probably because car dealers run their own credit checks on customers, so if someone has been able to purchase several cars, it’s likely that they have a good credit history. Another interesting insight really helped us understand the behaviour of some of the less affluent debtors. If you are wealthy, it often means you have a well-established credit record, with mortgages, savings, assets and other debts and credits that get tracked by our systems – so there’s a lot of data that we can use to predict your behaviour.

“However, if you rent your apartment and you don’t have a car or a bank account or a credit card, the amount of data that we can obtain is much smaller, and your behaviour is harder for us to predict. With the new solution, we have discovered that unpaid train tickets and TV licences are strong indicators of bad payment behaviour, so this helps us decide how to handle these cases.”

Accurate predictions
The SKAT team is not using the models in real collection situations yet, because some of the other aspects of the project are still underway. However, it has tested the model (which was originally created using data from 2009) on subsequent years’ data, and has demonstrated a prediction index of 70 percent.

“We’re very pleased with the accuracy of the model – a prediction index of 70 percent is comparable with the results achieved by most banks,” states Mads Krogh Nielsen.

“Moreover, banks have the luxury of being able to turn customers away if they think they are an unpredictable credit risk, whereas our model needs to handle every citizen and business in Denmark – so this makes our results even more impressive. We’re now planning to run a pilot project in one region, running the model in parallel with the current manual process and comparing results. This should help us to optimise the model and make it even more accurate.”

Making use of new skills
With the knowledge and skills they gained from the creation of the three main sub-models, the SKAT team embarked on several side-projects, creating predictive models for other areas of the organisation’s operations.

“The first was a bankruptcy prediction model, built with the skills we learnt from making the main collections model. Many of the variables were re-used and some new ones were added. The result was a model in which 92 percent of all companies that went bankrupt in 2009 were predicted to be in the upper 20 percent of the bankruptcy forecast model,” says Mads Krogh Nielsen. “This means that we only need to review 20 percent of the cases in order to identify 92 percent of bankruptcies. This is a significant advantage because if we think a company is likely to go bankrupt in the near future, we can lay claim to some of its assets to ensure that its arrears are covered.”
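The “92 percent in the upper 20 percent” figure is a capture rate: the share of all actual bankruptcies that fall among the highest-scored cases. A minimal sketch of that calculation is shown below; the function name and the ten-company toy data are illustrative assumptions and do not reproduce SKAT’s results.

```python
def capture_rate(scores, outcomes, top_fraction):
    """Share of all positive outcomes found among the highest-scored cases."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    cutoff = max(1, round(len(ranked) * top_fraction))
    total_positives = sum(outcomes)
    if total_positives == 0:
        return 0.0
    captured = sum(outcome for _, outcome in ranked[:cutoff])
    return captured / total_positives


# Toy example: 10 companies, 3 of which went bankrupt (outcome 1).
scores = [0.95, 0.90, 0.40, 0.35, 0.30, 0.25, 0.20, 0.15, 0.10, 0.05]
outcomes = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
rate = capture_rate(scores, outcomes, top_fraction=0.2)
print(f"{rate:.0%} of bankruptcies fall in the top 20% of scores")
```

A model with no predictive power would be expected to capture roughly 20 percent of bankruptcies in the top 20 percent of scores, which puts the reported 92 percent in context.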

A second project involved the creation of a model to predict late payers of VAT (which is known in Danish as ‘merværdiafgift’ or ‘MOMS’). Late payment of this tax is a problem for SKAT because it has to estimate what the correct payment should have been for each company, which is a complex and labour-intensive process. The new solution enables SKAT to identify the companies that are most likely to pay late, and send them reminders ahead of the due date.

“We’re now working on a third project to predict the workload for our telephone helpdesk at different times of the day, month and year,” comments Mads Krogh Nielsen. “This will be very valuable in terms of helping to gauge the required staffing levels, and should hopefully enable us to make further cost savings by deploying our human resources more effectively.”

Looking forward to significant cost benefits
SKAT is still working on implementing some of the downstream aspects of the new collections process, so the models are not yet being fully utilised. However, the organisation is confident that the IBM predictive analytics solution will ultimately deliver significant benefits.

“We have seen a similar initiative in Norway which resulted in 96 percent of all collections being fully automated in the first year of deployment, and 99 percent in subsequent years,” comments Mads Krogh Nielsen. “If we manage to achieve similar results, it will be a huge win for our organisation.

“Our project is part of a larger initiative that will also involve significant process re-engineering and the introduction of a new case handling system. According to the estimate of the Danish state auditors, when the whole initiative is complete it should reduce the department’s workload by the equivalent of approximately 134 full-time employees, which will allow us to redeploy staff into more valuable roles. That equates to an 18 percent reduction in total workload for the collections team, which is a very significant saving.”
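The two figures in this estimate can be cross-checked against the staffing numbers quoted earlier in the case study. The sketch below assumes that the 18 percent is measured against the roughly 750 collectors who deal with unpaid arrears (half of 1,500). The source does not state the baseline, so this is one possible reading rather than a confirmed one.

```python
# Figures quoted in the case study.
total_collectors = 1500      # SKAT's tax collectors
share_on_arrears = 0.5       # "half of whom deal with unpaid arrears"
redeployable_staff = 134     # state auditors' estimate, in full-time employees

# Assumption: the 18 percent is measured against the arrears staff only.
arrears_staff = total_collectors * share_on_arrears
workload_reduction = redeployable_staff / arrears_staff
print(f"{workload_reduction:.1%}")
```

Under that assumption the ratio rounds to 18 percent, which is consistent with the quoted figure.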

About IBM Business Analytics
IBM Business Analytics software delivers actionable insights decision-makers need to achieve better business performance. IBM offers a comprehensive, unified portfolio of business intelligence, predictive and advanced analytics, financial performance and strategy management, governance, risk and compliance and analytic applications.

With IBM software, companies can spot trends, patterns and anomalies, compare “what if” scenarios, predict potential threats and opportunities, identify and manage key business risks and plan, budget and forecast resources. With these deep analytic capabilities, our customers around the world can better understand, anticipate and shape business outcomes.

Products and services used

IBM products and services that were used in this case study.

SPSS Modeler, SPSS Collaboration and Deployment Services

Software Services for Business Analytics

Legal Information

© Copyright IBM Corporation 2012. IBM Danmark ApS, Nymøllevej 91, 2800 Kgs. Lyngby, Denmark. Produced in Denmark, March 2012. IBM, the IBM logo, Let’s Build A Smarter Planet, Smarter Planet, the planet icons and SPSS are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. A current list of other IBM trademarks is available on the Web at “Copyright and trademark information”. Other company, product or service names may be trademarks or service marks of others. References in this publication to IBM products, programs or services do not imply that IBM intends to make these available in all countries in which IBM operates. Any reference to an IBM product, program or service is not intended to imply that only IBM’s product, program or service may be used. Any functionally equivalent product, program or service may be used instead. All customer examples cited represent how some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions. IBM hardware products are manufactured from new parts, or new and used parts. In some cases, the hardware product may not be new and may have been previously installed. Regardless, IBM warranty terms apply. This publication is for general guidance only. Photographs may show design models.