When a business needs a shared data model, not just more integrations

Companies often diagnose an integration problem when the real issue is the data model. Orders are duplicated, customer names differ across systems, statuses do not match between ERP and ecommerce, or reports need manual cleanup before anyone can trust them. The usual reaction is to add another integration or build one more export. But when each system defines the data in its own way, the problem is only moved, not solved.

If each system interprets the same data differently, the business does not have integration: it has incompatible versions of reality.

A shared data model does not mean centralizing everything into a single database or forcing a monolithic architecture. It means agreeing on which entities are common, what they mean, and who is responsible for each change. For an SME, that can be as concrete as defining what a customer, order, return, product or invoice actually is. Without that definition, every new project ends up creating its own translation layer between systems.

Why integrations are not enough as the business grows

Integrations solve transport. Data models solve meaning. That distinction sounds theoretical, but in daily operations it becomes visible very quickly. An ERP may consider an order confirmed once stock has been reserved. An ecommerce platform may consider it sold the moment payment succeeds. Customer service may want it to remain pending until it leaves the warehouse. All systems are connected, but none of them speaks exactly the same language.
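The difference between transport and meaning can be sketched in a few lines of Python. This is only an illustration: the milestone names, the system keys and the labels "open", "unconfirmed" and "shipped" are assumptions made for the example, not the vocabulary of any real ERP or ecommerce platform.

```python
from enum import Enum


class Milestone(Enum):
    """Facts about an order that every system can agree on (names are assumed)."""
    PAYMENT_SUCCEEDED = "payment_succeeded"
    STOCK_RESERVED = "stock_reserved"
    LEFT_WAREHOUSE = "left_warehouse"


def local_views(reached: set[Milestone]) -> dict[str, str]:
    """Return the label each system would show for the same order."""
    return {
        # Ecommerce: sold as soon as payment succeeds.
        "ecommerce": "sold" if Milestone.PAYMENT_SUCCEEDED in reached else "open",
        # ERP: confirmed only once stock has been reserved.
        "erp": "confirmed" if Milestone.STOCK_RESERVED in reached else "unconfirmed",
        # Customer service: pending until the order leaves the warehouse.
        "customer_service": "shipped" if Milestone.LEFT_WAREHOUSE in reached else "pending",
    }


# One order, paid but with no stock reserved yet.
views = local_views({Milestone.PAYMENT_SUCCEEDED})
print(views)
```

The three systems are perfectly connected here and still report three different labels for the same order. Agreeing on the milestones is the shared-model part; the labels are just local presentation.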

As the business grows, exceptions grow too. New channels appear, pricing rules become more complex, batches are introduced, subscriptions are launched, partial returns happen, international orders arrive, or manual steps are added for edge cases. If the company only thinks in point-to-point integrations, every new case adds another exception. The cost is not only development; it is the ongoing work of keeping data consistent when processes, catalogs or responsibilities change.

A very common symptom is the “contested master data” problem. The same customer exists in CRM, ERP and ecommerce, but not with the same structure or the same identifier. The same happens with products, warehouses or price lists. Teams then stop trusting the system and fall back on spreadsheets, email or manual checks. In the short term that feels pragmatic; in the medium term it creates a slow and fragile operation.

What a shared data model should include

The most common mistake is trying to model everything from day one. That usually delays the project and increases internal resistance. It is better to start with the entities that affect operations and reporting the most. In many companies these are customer, product, order, invoice, shipment, stock and return.

For each entity, four things should be defined:

Business name and meaning: for example, what distinguishes a confirmed order from a created or prepared one.
Unique identifier: which field or combination of fields lets all systems recognize the same entity.
System of reference: which source wins when there is a conflict.
Allowed events or states: which transitions are valid and which ones must be blocked or reviewed.
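The four points above can be written down as a small, machine-readable definition. The following Python sketch is one possible shape, not a standard: the class name `EntityDefinition`, the field names, the order states and the choice of the ERP as system of reference for orders are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EntityDefinition:
    name: str                          # business name
    meaning: str                       # what the entity means in plain language
    identifier_fields: tuple[str, ...]  # field(s) that identify it everywhere
    system_of_reference: str           # source that wins on conflict
    allowed_transitions: dict[str, frozenset[str]] = field(default_factory=dict)

    def can_transition(self, current: str, new: str) -> bool:
        """True only if the shared model explicitly allows current -> new."""
        return new in self.allowed_transitions.get(current, frozenset())


# Hypothetical definition of "order"; every value below is an example.
ORDER = EntityDefinition(
    name="order",
    meaning="A customer's request to buy, from creation until shipment.",
    identifier_fields=("order_number",),
    system_of_reference="erp",
    allowed_transitions={
        "created": frozenset({"confirmed", "cancelled"}),
        "confirmed": frozenset({"prepared", "cancelled"}),
        "prepared": frozenset({"shipped"}),
    },
)

print(ORDER.can_transition("created", "confirmed"))  # allowed by the model
print(ORDER.can_transition("created", "shipped"))    # skips steps, so blocked
```

Keeping the definition in one reviewable place means an integration can check a transition against it instead of encoding its own interpretation.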

It is also worth documenting which data should not be duplicated. A typical example is the customer billing address, which is often copied into several systems and then updated in some but not others. Another frequent case is the product catalog: if commerce, sales and operations edit descriptions, units or attributes without a shared rule, the result will be inconsistent.

A shared model does not have to be rigid. It can coexist with local extensions by channel or business unit. The key is that those extensions do not break the common core. If one channel needs specific fields, they can be added as complementary attributes without redefining the base entity.
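One way to keep extensions from breaking the common core is to store them in a separate, namespaced area of the entity. The sketch below assumes a hypothetical `Product` entity with three core fields; the namespace "marketplace" and its attribute are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Any

CORE_FIELDS = {"sku", "name", "unit"}


@dataclass
class Product:
    # Common core, shared by every channel.
    sku: str
    name: str
    unit: str
    # Complementary attributes, grouped by channel or business unit.
    extensions: dict[str, dict[str, Any]] = field(default_factory=dict)

    def extend(self, namespace: str, **attributes: Any) -> None:
        """Add channel-specific attributes without touching the core."""
        clash = CORE_FIELDS.intersection(attributes)
        if clash:
            raise ValueError(f"extension may not redefine core fields: {sorted(clash)}")
        self.extensions.setdefault(namespace, {}).update(attributes)


mug = Product(sku="SKU-1", name="Mug", unit="piece")
mug.extend("marketplace", listing_category="kitchen")
print(mug.extensions)
```

A channel that tries to pass `unit="box"` through `extend` gets a `ValueError`, which is the point: changing the base entity should be a shared decision, not a side effect of one channel's needs.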

How to move from isolated systems to a common base without stopping operations

The transition should happen in stages. Trying to redesign all systems at once usually creates operational risk and team fatigue. A more realistic approach starts by mapping where the most expensive discrepancies occur: orders, customers, inventory, invoicing or reporting.

A good first step is to build a data source inventory. It does not have to be complex: just list which system creates each data element, which one changes it, where it is consumed and what happens if it is missing or late. That map often reveals that some data is created in too many places and other data has no clear owner.
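Such an inventory can start as a spreadsheet, but even a few lines of Python are enough to surface the two findings mentioned above. In this sketch the entries, system names and owners are made-up examples, and the field names simply mirror the questions in the paragraph.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InventoryEntry:
    element: str
    created_in: list[str]      # systems that create the data element
    changed_in: list[str]      # systems that modify it
    consumed_in: list[str]     # systems that read it
    if_missing_or_late: str    # operational consequence
    owner: Optional[str] = None


def created_in_too_many_places(entries: list[InventoryEntry]) -> list[str]:
    """Data elements that more than one system creates."""
    return [e.element for e in entries if len(e.created_in) > 1]


def without_owner(entries: list[InventoryEntry]) -> list[str]:
    """Data elements that nobody is responsible for."""
    return [e.element for e in entries if e.owner is None]


# Example rows only; a real inventory comes from interviewing each team.
INVENTORY = [
    InventoryEntry(
        "customer billing address",
        created_in=["crm", "ecommerce"],
        changed_in=["crm", "erp", "ecommerce"],
        consumed_in=["erp"],
        if_missing_or_late="invoice is issued with an outdated address",
    ),
    InventoryEntry(
        "stock level",
        created_in=["erp"],
        changed_in=["erp"],
        consumed_in=["ecommerce"],
        if_missing_or_late="ecommerce may sell unavailable items",
        owner="operations",
    ),
]

print(created_in_too_many_places(INVENTORY))
print(without_owner(INVENTORY))
```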

The next step is to define a master flow per entity. For example, the ERP may be the reference for invoices and stock; ecommerce may be the reference for checkout states; CRM may be the reference for segments or commercial relationships. What matters is not that all data originates in one system, but that the company knows where the operational truth of each data element lives.
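Once the reference system for each data element is agreed, conflict resolution becomes a lookup rather than a debate. A minimal sketch, using the example assignments from the paragraph above; the keys and the function name `resolve` are assumptions for illustration.

```python
# Which system holds the operational truth for each data element.
# The assignments follow the example in the text; adapt them per company.
SYSTEM_OF_REFERENCE = {
    "invoice": "erp",
    "stock": "erp",
    "checkout_state": "ecommerce",
    "customer_segment": "crm",
}


def resolve(element: str, values_by_system: dict[str, object]) -> object:
    """Return the value from the system of reference for this element."""
    reference = SYSTEM_OF_REFERENCE[element]
    if reference not in values_by_system:
        # Refuse to guess: without the reference value there is no truth to sync.
        raise LookupError(f"no value from reference system {reference!r} for {element!r}")
    return values_by_system[reference]


# ERP and ecommerce disagree on stock; the ERP wins because it is the reference.
print(resolve("stock", {"erp": 12, "ecommerce": 15}))
```

Raising an error when the reference value is missing is a deliberate choice in this sketch: silently falling back to another system would reintroduce the ambiguity the master flow is meant to remove.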

From there, integrations should be designed to synchronize, not reinterpret. That means validating formats, states and rules before sending data to other systems. It also means logging errors in a way people can actually use: it is not enough for an integration to “fail”; the team needs to know which record failed, why and who should correct it.
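As an illustration of validating before sending and of errors people can act on, the sketch below checks an order against the shared states and returns, for each problem, the record, the reason and who should fix it. The allowed states, the field names and the owner labels are hypothetical.

```python
from dataclasses import dataclass

# States agreed in the shared model (example values).
ALLOWED_STATES = {"created", "confirmed", "prepared", "shipped"}


@dataclass
class SyncError:
    record_id: str  # which record failed
    reason: str     # why it failed
    owner: str      # who should correct it


def validate_order(order: dict) -> list[SyncError]:
    """Check an order before it is sent to other systems."""
    errors: list[SyncError] = []
    record_id = str(order.get("order_number") or "<missing order_number>")

    if not order.get("order_number"):
        errors.append(SyncError(record_id, "order_number is missing", "ecommerce team"))

    state = order.get("state")
    if state not in ALLOWED_STATES:
        errors.append(
            SyncError(record_id, f"state {state!r} is not in the shared model", "operations")
        )
    return errors


for error in validate_order({"order_number": "A-2", "state": "sold"}):
    print(f"{error.record_id}: {error.reason} (to be fixed by {error.owner})")
```

An order that passes returns an empty list and can be synchronized as-is; one that fails never reaches the downstream system with a reinterpreted state.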

At this stage, a normalization layer is often helpful, whether through APIs, queues or an intermediate data service. Its role is not to add complexity, but to isolate change. If the ecommerce structure changes tomorrow, the company should not have to rebuild every downstream connection.
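A normalization layer can be as small as one function per source system that translates its payload into the shared shape. The sketch below assumes an invented ecommerce payload (`id`, `customer`, `total`, `status`) and invented shared field names; it does not describe any real platform's API.

```python
from decimal import Decimal

# Translation from ecommerce labels to shared states (hypothetical values).
ECOMMERCE_STATE_MAP = {
    "sold": "paid",
    "dispatched": "shipped",
}


def normalize_ecommerce_order(payload: dict) -> dict:
    """Translate an ecommerce payload into the shared order shape.

    If the ecommerce structure changes, only this function changes;
    downstream systems keep reading the shared shape.
    """
    return {
        "order_number": str(payload["id"]),
        "customer_id": str(payload["customer"]["id"]),
        # Store money as integer cents to avoid float rounding issues.
        "total_cents": int(Decimal(payload["total"]) * 100),
        "state": ECOMMERCE_STATE_MAP[payload["status"]],
    }


shared_order = normalize_ecommerce_order(
    {"id": 1001, "customer": {"id": 7}, "total": "19.99", "status": "sold"}
)
print(shared_order)
```

An unknown status raises a `KeyError` here instead of being passed through, which keeps the layer from quietly inventing new states.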

Signs it is time to invest in a shared model

There are some very clear indicators. The first is the time the team spends reconciling data across systems. If every month-end close, campaign or commercial review requires hours of manual checking, the cost is already structural.

The second indicator is inconsistency in decisions. When sales, operations and administration work with different numbers, the problem is not just reporting; it is data governance. In that situation, each area ends up defending its own version and the conversation becomes political instead of operational.

The third indicator is dependence on specific people. If only one or two people truly understand how the data fits together across systems, the business is exposed. Critical knowledge should not live in a personal spreadsheet or in one technician’s memory.

It is also worth looking at the cost of small changes. If adding a new channel, a new price rule or a new business rule requires touching five systems and reviewing ten exceptions, the current model is already penalizing growth.

A shared data model is not a passing fad or an abstract consulting layer. It is a way to reduce operational friction and make integrations sustainable. For many SMEs, the right moment to address it comes earlier than expected: when the systems are working, but the business is starting to move more slowly than it needs to. At that point, Codefuente typically helps teams organize the data before adding more connections.