Data Migration: Definition, Strategy and Tools


Data migration is the process of transferring data from one storage system, computing environment, location, format, or application to another. You may need to migrate data when replacing servers or storage devices, or when consolidating or decommissioning a data center. Data migration is also an essential step in moving on-premises IT infrastructure to a cloud computing environment, both to modernize the data warehouse for modern analytics use cases and to gain cost efficiencies.

Data migration is also needed to consolidate data from various sources into one central repository that multiple divisions of the organization can access. This often occurs after an acquisition, when systems from different companies need to be combined, or when systems are siloed throughout the organization. Whether you’re moving to a public cloud, private cloud, hybrid cloud, or multi-cloud environment, you’ll need a secure, cost-effective, and efficient method of migrating your data to its new storage location.

How to choose the right approach for data migration

Choosing the right approach to data migration is the first step in ensuring your project will run smoothly, with no severe delays.

Big bang data migration

In a big bang data migration, the entire transfer is completed within a limited window of time. Systems are down and unavailable to users while the data is moved and transformed to meet the requirements of the target infrastructure. The migration is typically executed over a public holiday or weekend, when customers presumably aren’t using the application. The big bang approach lets you complete the migration in the shortest possible time and spares you the hassle of working across old and new systems simultaneously.

The pros of this approach: it is less costly, less complex, takes less time, and all changes happen at once. The cons: it carries a high risk of expensive failure, requires downtime, and may hurt customer loyalty. The big bang approach suits small companies or businesses working with small amounts of data. It doesn’t work for mission-critical applications that must be available 24/7.
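As a minimal sketch of the big bang pattern, the script below copies and transforms an entire table in one offline window. The table name, schema, and cleanup rule are hypothetical examples, and sqlite3 stands in for real source and target systems:

```python
# Big-bang migration sketch: move everything at once while systems are offline.
# The "customers" table and the name-normalization transform are illustrative.
import sqlite3

def big_bang_migrate(source_db: str, target_db: str) -> int:
    """Copy every row from source to target in a single offline window."""
    src = sqlite3.connect(source_db)
    dst = sqlite3.connect(target_db)
    dst.execute(
        "CREATE TABLE IF NOT EXISTS customers (id INTEGER PRIMARY KEY, name TEXT)"
    )
    rows = src.execute("SELECT id, name FROM customers").fetchall()
    # Transform step: normalize names to meet the target schema's conventions.
    transformed = [(row_id, name.strip().title()) for row_id, name in rows]
    dst.executemany("INSERT INTO customers (id, name) VALUES (?, ?)", transformed)
    dst.commit()
    return len(transformed)
```

Because everything moves in one pass, a failure midway means restoring and restarting, which is exactly the risk the approach trades for its simplicity.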

Trickle data migration or phased data migration

Trickle data migrations, in contrast, complete the migration process in phases. The old system and the new one run in parallel during implementation, which eliminates downtime and operational interruptions. Processes running in real time keep data continuously migrating. The pros of this approach: it is less prone to unexpected failures and requires zero downtime. The cons: it is more expensive, takes more time, and requires extra effort and resources to keep two systems running. Trickle data migration is the right choice for medium and large enterprises that can’t afford extended downtime but have enough expertise to meet the technological challenges.
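One common way to implement a trickle migration is to move data batch by batch using a high-water mark, so both systems stay live between batches. The sketch below assumes a hypothetical table with a monotonically increasing id; all names are illustrative:

```python
# Trickle-migration sketch: move rows in small batches while both systems run.
# Assumes rows have a monotonically increasing id acting as the high-water mark.
import sqlite3

def migrate_batch(src: sqlite3.Connection, dst: sqlite3.Connection,
                  last_id: int, batch_size: int = 2) -> int:
    """Move the next batch of rows with id > last_id; return the new mark."""
    rows = src.execute(
        "SELECT id, name FROM customers WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, batch_size),
    ).fetchall()
    # INSERT OR REPLACE makes the batch safe to re-run after a failure.
    dst.executemany("INSERT OR REPLACE INTO customers VALUES (?, ?)", rows)
    dst.commit()
    return rows[-1][0] if rows else last_id
```

Calling this in a loop (or from a scheduler) until it returns an unchanged mark drains the source without ever taking it offline, at the cost of operating two systems in parallel.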

Data migration tools

Today, there are plenty of tools available to facilitate enterprise data migrations. These include vendor-specific solutions offered by cloud providers to support their customers’ move into their public or private cloud environment as well as licensed and open-source tools.

Your data migration strategy and desired business outcome will determine which tools work best for your project. Whether you are consolidating multiple data sources into one, replacing, maintaining, or upgrading server or storage equipment, relocating from one data center to another, or recovering data from a damaged or compromised source, your organization will need a reliable data migration tool to do it.

Data migration tools generally fall into three groups: vendor-specific cloud solutions, licensed commercial tools, and open-source tools. The right choice depends on the requirements and circumstances of the data user.

Questions to ask when selecting a data migration tool

Migrating an entire data center environment to the cloud or another location is a large-scale, comprehensive process. Completing such a data migration project successfully requires careful planning and coordination with minimal downtime or disruption to operations.

Here are a few questions to help you choose the right data migration tool for you:

  1. Location: Do you want to migrate data on-premises (within the same environment)? Do you want to move data from on-premises to the cloud, or from one cloud storage service to another? The answer will help you determine which group of tools to consider.
  2. Source and target environments: What subset of data do you need to move? Are there any peculiarities or variations in how the system has been used? Will the same operating system be running in both environments? Will database schemas or other formatting need to change? Do data quality issues need to be addressed before the migration?
  3. Data and user: Who uses the data now, who will use it in the future, and how will it be used? Data that’s leveraged for analytics, for example, may have very different storage and formatting requirements than data being retained for regulatory compliance. Be sure to gather information from all relevant stakeholders and business units throughout the data migration process.
  4. Business requirements and potential impact: What kind of migration timeline is necessary? If a data center is being decommissioned, when will its lease expire? What types of data security must you maintain throughout the migration process? Is any data loss or corruption tolerable, and if so, how much? How would delays or unexpected stumbling blocks affect the business?
  5. Cost: Is cost-effectiveness a priority? Using a cloud-based data migration tool can help you save significantly on infrastructure and human resources costs, freeing up resources for other projects.
  6. Data volume: How much data needs to be migrated? If you are migrating fewer than ten terabytes (TB) of data, shipping the data to its new storage location on a client-provided storage device is often the simplest and most cost-effective method. For transfers involving larger amounts of data, say up to multiple petabytes (PB), a specialized data migration device supplied by your cloud provider can be the most convenient and affordable option. Alternatively, you could use an online data migration tool or a cloud data ingestion tool to move and ingest massive amounts of data quickly into either a cloud data warehouse or a cloud data lake.
  7. Data model: Do you need to change your data model? You may be moving from an on-premises data warehouse to a cloud-based one, or you may be moving from relational data to a mix of structured and unstructured data. Cloud-based data migration tools tend to support the widest variety of data models, whereas on-premises tools tend to be the least flexible.
  8. Data quality: Are there requirements for cleansing data, running rules against the source data, or loading the data into the target? What is the workflow for improving data quality to satisfy data governance regulations?
  9. Data transformation: Do you need to transform (enrich, cleanse, merge, etc.) your data? Because you will be adding or changing data sources, you will almost certainly need to transform your data as part of the migration process. All data migration tools can transform data, but cloud-based tools tend to be the most flexible, supporting the broadest range of data types.
  10. Security: Is any of the data you are migrating sensitive? Sensitive data is subject to compliance requirements that can be hard to support during migration. Cloud-based tools are likely to be highly secure and compliance-certified; on-premises solutions depend on the security of your overall infrastructure.
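The volume question above lends itself to a simple decision rule. The helper below encodes the rough thresholds described in the list; the exact cutoffs and method names are assumptions for illustration, not vendor guidance:

```python
# Illustrative helper mapping data volume to a transfer method, following the
# rough thresholds above. The cutoffs and labels are assumptions.
def transfer_method(volume_tb: float) -> str:
    """Suggest a transfer method for a given data volume in terabytes."""
    if volume_tb < 10:
        # Small transfers: ship the data on a client-provided device.
        return "ship on client-provided storage device"
    if volume_tb < 1000:  # up to roughly one petabyte
        # Larger transfers: a provider-supplied migration appliance.
        return "cloud provider migration appliance"
    # Massive transfers: online ingestion into a warehouse or data lake.
    return "online ingestion into cloud warehouse or data lake"
```

In practice the decision also weighs network bandwidth and timeline, but a first cut by volume narrows the tool shortlist quickly.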

5 critical capabilities of a data migration tool

Building data migration tools from scratch and coding them by hand is challenging and incredibly time-consuming. Data tools that simplify migration are more efficient and cost-effective. When you start your search for a software solution, look for these five critical capabilities.

Data migration checklist

1. Define your goal state

Don’t start with a vendor and work backward to shoehorn your data needs into their cloud data warehouse, cloud data lake, or both. Instead, determine what your business goals call for and choose the solution that best meets your needs now while also offering the extensibility to support growth without forcing you to rip and replace previous work. The right cloud data management solution will support your data migration to whichever vendor you choose.

2. Catalog your data

An intelligent data catalog – one that provides insight into what data you have, where it is, what is in current use, and how it needs to be protected – establishes your starting point and makes it easier to find and access specific data as required. When you can quickly identify high-value data and prioritize moving to your new cloud data warehouse or data lake, your data consumers can start using the latest technology right away while your development team backfills without disruptions.
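A toy version of such a catalog can be sketched as a list of entries recording what each dataset is, where it lives, and how it should be handled. The entries and fields below are hypothetical examples; real catalogs add automated scanning, classification, and lineage:

```python
# A toy data catalog. The datasets, locations, and fields are illustrative.
catalog = [
    {"name": "orders", "location": "on-prem/oracle", "sensitive": False, "priority": "high"},
    {"name": "payroll", "location": "on-prem/sap", "sensitive": True, "priority": "low"},
    {"name": "clickstream", "location": "s3/raw", "sensitive": False, "priority": "high"},
]

def migration_order(entries):
    """Sort high-priority, high-value data first so consumers benefit early."""
    return sorted(entries, key=lambda e: e["priority"] != "high")

def sensitive_datasets(entries):
    """List datasets needing extra protection during the move."""
    return [e["name"] for e in entries if e["sensitive"]]
```

Even this minimal structure answers the catalog's core questions: what data exists, where it is, and what must be protected.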

3. Standardize and cleanse your data

The more attention you pay to data quality and governance before your cloud data migration, the less work you’ll have to do to prepare your data for analysis in the cloud. Look for an extensive set of pre-built data quality rules that let you cleanse, standardize, and enrich all data without coding to ensure that your data users can trust the data they receive and analyze.
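Pre-built quality rules of this kind typically compose into a pipeline applied to each field before the move. The sketch below shows the idea with two invented rules on a hypothetical country field; a real rule library would be far richer:

```python
# Sketch of composable data quality rules applied before migration.
# The rule set and the country-code example are hypothetical.
RULES = [
    ("trim_whitespace", lambda v: v.strip()),
    ("uppercase", lambda v: v.upper()),
]

def cleanse_country(value: str) -> str:
    """Apply each standardization rule in order to a country-code field."""
    for _name, rule in RULES:
        value = rule(value)
    return value
```

Running such rules at the source means the data lands in the cloud already trusted, instead of deferring cleanup to every downstream consumer.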

4. Manage your metadata

Metadata management is key to automating the process of discovering, tagging, relating, and provisioning data into your cloud data warehouse or data lake. Choose a solution that can scan all enterprise systems and collect all technical, business, operational, infrastructure, and usage metadata – from database schemas and glossary terms to volume metrics and user access patterns.

In addition, your metadata management solution should be able to curate your metadata, augment it with business context, and infer data lineage and relationships between entities.
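The scanning step described above can be sketched against a single database: walk the tables and collect technical metadata such as column lists and row counts. Here sqlite3 stands in for an enterprise metadata scanner, and the collected fields are a small illustrative subset:

```python
# Sketch of collecting technical metadata (schemas, volume metrics) from a
# source database. sqlite3 stands in for an enterprise scanner.
import sqlite3

def scan_metadata(conn: sqlite3.Connection) -> dict:
    """Return {table: {"columns": [...], "row_count": n}} for every table."""
    meta = {}
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    for table in tables:
        # PRAGMA table_info yields (cid, name, type, ...); index 1 is the name.
        cols = [c[1] for c in conn.execute(f"PRAGMA table_info({table})")]
        rows = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        meta[table] = {"columns": cols, "row_count": rows}
    return meta
```

Business metadata (glossary terms, ownership, usage patterns) would be layered on top of this technical inventory by the curation step.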

Data Migration Success Stories

Abu Dhabi Department of Culture and Tourism

The Abu Dhabi Department of Culture and Tourism (DCT) regulates, develops, and promotes the emirate of Abu Dhabi. It has ambitious goals for increasing visits to the emirate by integrating and aggregating tourism data from over 100 sources and stakeholders, including legacy systems and databases, hotels, malls, tourist attractions, Wi-Fi hotspots, TripAdvisor, and cultural sites.

The Informatica data migration platform helped DCT rapidly onboard tourism data from disparate source systems into a data warehouse in just four months, giving it a 360-degree view of visitors, democratizing data for all users, and enabling advanced analytics.

Equinix

Equinix provides a single, global interconnected platform that enables customers to deploy infrastructure and services directly and privately to mission-critical clouds, services, and networks. The company needed to consolidate all data-related systems into a centralized data platform to improve data access and reduce time to market for new solutions and features.

Informatica Intelligent Cloud Services, Informatica’s integration platform as a service (iPaaS) solution, supported the deployment of Equinix’s Google Cloud Platform (GCP) data analytics platform by migrating and unifying data from several data warehouses and analytics solutions onto a central Google Cloud platform, delivering real-time insights and democratizing access to data for all users.

LANDBANK

The fourth-largest bank in the Philippines, LANDBANK wanted to improve customer service, lower data management costs, enhance decision-making, and ensure regulatory compliance. Cloud data migration let the bank replace a variety of house-built and third-party on-premises legacy systems for finance, accounting, and supply chain management with a single comprehensive enterprise solution that generates more timely, accurate, and reliable business intelligence.

Global transportation company

A global shipping company wanted to improve and accelerate data governance for faster, more accurate reporting on the location and performance of its ships and cargo. It migrated data from booking systems, container terminals, financial systems, ships at sea, and more to a cloud data lake. As a result, its machine learning models can now choose routes and ports and forecast demand more accurately and 10 times faster.

2-1-1 San Diego

2-1-1 San Diego connects local residents with community, health, and disaster services through a 24/7 phone hotline and online database. To provide faster access to services, the nonprofit needed a way to integrate data in multiple formats from 1,200 partner agencies. Data migration allowed 2-1-1 San Diego to create a single “golden record” for every partner and every caller, reducing the time and cost of each call, improving coordination and results tracking, and ensuring better service outcomes for more than 100,000 people, with room to grow to 1 million or more.

Get started with data migration today

Informatica supports your successful data migration with a next-generation, AI-powered cloud data integration platform that offers best-of-breed ETL (extract, transform, load), ELT (extract, load, transform), and Spark-based serverless processing, enabling you to build and run advanced, complex integrations at scale.

With Informatica’s cloud data integration platform, you can easily ingest, integrate, and transform your data and load it into any of the leading cloud platforms, including AWS, Microsoft Azure, Snowflake, Databricks, and Google Cloud.

See how broad and deep out-of-the-box connectivity, prebuilt advanced transformations, and zero-code orchestrations help you build enterprise integration workloads—fast. Start your free, 30-day trial now.