For multinational corporations, transfer pricing and its interaction with tax authorities garner higher and higher amounts of government attention. International tax codes, OECD guidelines, and rules are ever-evolving in both small and large ways. Governments require corporations to share and exchange ever increasing amounts of information. The days of calculating and maintaining transfer price adjustments in spreadsheets have ended.

Transfer pricing software logically breaks down into two parts: data and application. In this post, we will look at the data part of the equation. In our experience, getting the data right can be more than half the battle. For many projects obtaining the needed raw transactional data, master data, and financials in the proper format and the correct context is often a hard-fought battle.

The root cause of most unexpected results in the initial stages of a transfer pricing project is often due to the underlying data. This frequently runs counterintuitive to where most analysts expect to be spending their time in the early part of a project. Instead of jumping directly into building models, running and testing data to validate results, most technology projects bog down due to flawed data.

What are the data-side requirements of a modern data-driven transfer pricing system that provides a unified, cohesive, and repeatable set of this underlying data?

  • The data sources that provide the required raw data
  • The data formats of the data sources – numerical, textual, and date formats
  • The consistency of this incoming data – does the data format change month to month?
  • The cleanliness of the incoming information – extra spaces, missing negatives, extra columns
  • How do columns in one source map to the other sources?
  • Data completeness – required master data to add usefulness to the source data
  • Calculated and summarized columns and tables

Overall, the requirements fit what we will call a data enrichment pipeline. This is in contrast to a common data pipeline (ETL) that moves data from a source to a target system. A data enrichment pipeline ingests, aligns, and enriches the raw data before sending it to the target. For our purpose today, we will examine the first three parts of the data enrichment pipeline.


Transfer Pricing – Data Enrichment Pipeline

Data Management in Transfer Pricing


The first issue is that our raw data typically comes from many different sources. Multiple ERP systems often provide the needed transactional and ledger data, master data may come from web-enabled systems, and additional data could be sent from a tax analyst’s laptop from another part of the world. A modern transfer pricing system must connect with and retrieve data from all source systems. To be considered modern, a system should have prebuilt low-code connections that only require configuration within the application.

These connections will most likely consist of remote function call (RFC) connections to legacy ERP systems such as SAP ECC, modern database connections to databases such as S4/Hana, web-based API interface connections, and the ability to load any flat-file format. PlaidCloud provides these low-code connections right out of the box without the need to spend time and resources on building these from scratch.



The next issue is that the raw data usually has different formats, extra spaces, different headers, and a multitude of other abnormalities between sources. A modern transfer pricing system should have the ability to trim leading and trailing spaces, convert between ASCII and UTF encoding, handle NULL values, extract time, timestamps, and dates, and convert between text and numeric. PlaidCloud has these cleaning functions built-in, and a user can activate them by checking a box.

After cleaning the source data, the system must combine the input tables into a unified starting data store. A modern system must allow for the easy mapping of columns from one table to another, quickly adding or deleting columns as needed, and inbound and outbound filtering capabilities. PlaidCloud provides these capabilities in a no-code/low-code environment and can map, add and delete columns, aggregate, and filter in as little as three clicks.

Data Enrichment Flexibility

While the goal of the data enrichment process is to accomplish everything with the no-code approach, it is not always possible to handle every complex scenario easily.  Therefore, flexibility must be part of the data enrichment process so there is a way to handle any data requirement easily without customization.  This flexibility avoids solution customization which can add tremendous cost, complexity and risk to the project.  For example, PlaidCloud provides the ability to include User Defined Functions (UDFs) directly in the pipeline workflows.  These UDFs are written using common Python.  It is important that a solution that allows for code creation utilize a common language rather than a domain specific one to enable a short learning curve, broad support resources, and a larger talent pool.

The screenshot below is an illustration of a UDF to handle some complex rules around which target currency is used for each of the legal entities depending on several attributes.



We’ve discussed how important it is to have quality data as a starting point for a transfer pricing system. Project teams are often surprised by the low quality of the incoming raw data. This low-quality data often causes project delays until a high-quality starting data store is established. 

We’ve covered that a modern system will have the data ingestion, alignment, and enrichment steps built-in and configurable with clicks instead of code. These low-code built-ins speed up the process required to get the data to a place where it is unified, cohesive, repeatable, and a sufficient starting point for the build-out of the transfer pricing model.


In our next set of posts, we will discuss the application side of a modern transfer pricing system. The application side will be broken down into two articles. The following post covers model automation, rules engine, allocations, result reviewing and monitoring, and useful dashboards and reports. The last post in this series will cover suggested transfer pricing changes and retroactive price adjustments (RPA).  All of these components are included in the PlaidCloud Goods Transfer Pricing and the Intercompany Invoicing and Tax Compliance solutions.