Syngenta’s Search for a Single Source of Truth

Syngenta is a global AgTech company dedicated to helping millions of farmers around the world safely and sustainably grow high-quality food, feed, fiber, and fuel. The company’s 26,000 employees in 100 countries use world-class science to transform how crops are grown and protected. In 2020, Syngenta had $14.3B in global sales and devoted 10% of revenue to R&D.

Syngenta achieved its position as a world leader in AgTech through innovation and a succession of successful mergers and acquisitions. The downside of this growth path was a fractured IT environment.

For Syngenta’s data scientists and visualization engineers, the siloed IT environment presented real problems. Like the tip of an iceberg, only 20% of the company’s data from trials, R&D, and field observations was visible. But ensuring the provenance of this visible tip required navigating the other 80%: arcane, siloed data and processes that were obscured from view.

Syngenta’s senior data architect found a reproducible and scalable way to integrate the silos of data with the help of CompilerWorks. Creating a set of data marts on Amazon Redshift, Syngenta assimilated data from 60 different sources, integrating the entire technology, process, and product lifecycle of the Syngenta Group. CompilerWorks Lineage provided transparency, enabling data scientists to see where their data originated. Data lineage was presented in a simple, understandable way, giving confidence in the new data source.

The landscape of assimilated data, processed through CompilerWorks, let Syngenta’s data scientists focus on the business meaning of the data in a trusted, agnostic way, providing data observability without ever having to dig into the source.

For more about how CompilerWorks Lineage helped Syngenta automate data management and gave data scientists confidence in their data, read the expanded Syngenta case study on InformationWeek.

Cloud Data Warehouse Accelerating Data Engineering and Cloud Transformation at ABN AMRO Bank

ABN AMRO Bank N.V. is the third-largest bank in the Netherlands. Headquartered in Amsterdam, it provides financial services to more than a quarter of the Dutch population. The bank employs 19,000 people and holds approximately $465.8 billion in assets.

In 2019, ABN AMRO began an IT digital transformation and modernization project to grow the number of teams working in the cloud. The project included migrating a 90TB on-premises, appliance-based Teradata enterprise data warehouse (EDW) to Microsoft Azure.

The data warehouse migration from Teradata EDW to Azure Data Factory’s (ADF) platform as a service (PaaS) architecture promised to optimize costs. It was also a hedge against a looming 2021 end-of-support deadline for the Teradata appliance. 

Regulatory Constraints Delay Migration Deadline

The bank identified 62 end-user groups using the Teradata EDW and asked them to re-engineer their workloads on Azure as part of the data platform modernization project. Unfortunately, many of the workloads were over ten years old and involved critical business logic with little, if any, documentation. 

The prospect of rewriting complex code from scratch was daunting. A significant number of the workloads were risk models used by the bank to satisfy regulatory requirements such as Basel III/IV, ECB, DCB, AFM, and GDPR.

Any changes to the highly regulated business logic would require regulators to re-validate the models: a time-consuming process that would delay the cloud transformation project. 

CompilerWorks Slashes Time to Re-Engineer Business Logic

ABN AMRO turned to CompilerWorks for help. Using a combination of CompilerWorks Lineage and Transpiler solutions, the bank’s DevOps teams extracted the business logic used by risk models and recreated the workloads on Azure’s cloud data warehouse.

An on-screen comparison of the Teradata and Azure datasets demonstrated to regulators that both platforms were using the same business logic, eliminating the need to re-validate the models.

Using CompilerWorks Lineage and Transpiler solutions enabled the bank to condense four years’ development work into one year. Through an automated migration, critical workloads were successfully moved to Azure on schedule, reducing the EDW’s total cost of ownership (TCO) by more than 10 million euros ($12 million) per year.

In addition to financial benefits, CompilerWorks brought clarity to ABN AMRO’s re-engineering efforts. IT teams and end-user groups acquired a much deeper understanding of the business logic used in their risk models. Retiring their Teradata EDW on schedule enabled the bank to adopt a modern cloud-based data architecture years earlier than would otherwise have been possible.

Decentralized Data Models and Improved Compliance

For the future, ABN AMRO anticipates providing end-user groups from across the enterprise access to data through a self-service data marketplace. The marketplace will include information such as data quality, ownership, and lineage over time, letting data consumers prove internal and external compliance at any point.

The bank also has plans to use CompilerWorks Transpiler to support future cloud database migrations. Transpiler will let users compare translated code in different SQL dialects and perform risk assessments before migrating to a new platform.

ABN AMRO Bank key outcomes:

  • Migrate workflows from Informatica PowerCenter to Azure Data Factory pipelines
  • Slash migration time and enable user groups to meet target completion dates 
  • Enable nearly $12 million per year in IT CAPEX/OPEX savings 
  • Provide upstream and downstream automated lineage transparency 
  • Achieve one-to-one transpilation from Teradata SQL to Azure SQL 
  • Simplify data architecture by identifying unused data and resources 
  • Bring clarity to business logic, streamlining regulatory compliance through automation
  • Save time, reduce cost, and hedge risk for large-scale data pipeline migrations

Find out more about how ABN AMRO Bank uses CompilerWorks Lineage and Transpiler solutions to accelerate cloud transformation, reduce costs, and maintain regulatory compliance. Read the full customer success story here.

How Lyft’s Amundsen App is Scaling Data Discovery with CompilerWorks Lineage

Applications that provide a search service for the things we need are the way of the future. One of the most popular application categories today is ride-sharing apps like Lyft and Uber.

Founded in 2012, Lyft has quickly become one of the largest transportation networks in the United States and Canada as the world shifts away from car ownership towards transportation-as-a-service.

Their mission? To improve people’s lives with the world’s best transportation. Lyft is making good on that mission with a transportation network that includes ridesharing, bikes, scooters, car rentals, and transit all available from a single phone app.  

Challenges in Scaling the Lyft App

With so much growth, making wise use of the data flowing into the application requires technology that can support it. Lyft relies on a cloud-based development infrastructure based on Amazon Web Services (AWS), including Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3).

Initially, Lyft’s front-end service depended on Amazon’s Redshift data warehouse and Kinesis message bus as data stores, but the tight coupling of compute and storage made it difficult to scale the application to keep up with the volume of frequent users. To resolve this, Lyft elected to migrate from Redshift to Apache Hive on the AWS cloud.

With a constant influx of new datasets from various sources, including SQL tables in Presto, Hive, and Postgres, as well as dashboards in BI tools like Mode, Superset, and Tableau, Lyft had little insight into its data lineage or the impact of changes on its data flow and access.

To maintain this momentum, Lyft knew it needed fast, flexible access to data to power its application and services, visualize information flow, identify and monitor errors, and conduct impact analyses of changes to its data.

Lyft Creates Amundsen Tool to Improve Data Access

To provide faster access to the data users need, Lyft, led by Amundsen co-creator Mark Grover, developed a backend data discovery tool named Amundsen (after the Norwegian explorer Roald Amundsen).

Amundsen is an open-source data discovery and metadata engine that enables data scientists and software engineers to gather the data they need from numerous pipelines in one central place, improving their productivity by up to 20%.

Amundsen enables users to:

  • Search for data assets with a simple search-engine-like interface that uses a PageRank-inspired algorithm to rank and recommend results (see the toy ranking sketch after this list).
  • Use a metadata service to view curated data, including details such as table statistics and when a table was last updated.
  • Learn from others by seeing what data your co-workers use most, common queries, and table-based dashboards.
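
To illustrate the flavor of popularity-weighted ranking, here is a toy sketch in Python. It is our illustration of the general idea, not Amundsen’s actual algorithm, and the table names and counts are invented:

```python
# Toy popularity-weighted search: among tables whose metadata matches the
# query, rank the ones queried most often highest. Illustrative only; not
# Amundsen's actual PageRank-inspired implementation.
TABLES = {
    "core.rides":        {"matches": True,  "query_count": 5200},
    "scratch.rides_tmp": {"matches": True,  "query_count": 12},
    "core.payments":     {"matches": False, "query_count": 9800},
}

def search(tables: dict) -> list:
    hits = [(meta["query_count"], name)
            for name, meta in tables.items() if meta["matches"]]
    return [name for _, name in sorted(hits, reverse=True)]

print(search(TABLES))  # ['core.rides', 'scratch.rides_tmp']
```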

Data sources can include:

  • Data stores like Hive, Presto, MySQL.
  • BI/reporting tools like Tableau, Looker, and Apache Superset.
  • Events and schemas stored in schema registries.
  • Streams like Apache Kafka and AWS Kinesis.
  • Processing information from ETL jobs and machine learning workflows.

One drawback to Amundsen, however, is that the data it pulls is presented in a static table format with little insight into where it came from and how it is being used, like a glossary with no definitions.

Users must then try to fill in the gaps themselves through manual mapping of data lineage, which can prove time-consuming and rife with error.

Improving the Data Model with CompilerWorks Lineage

To give users greater ability to trace the lineage of data from its various sources in Amundsen, Lyft employed CompilerWorks Lineage to better understand what data is being used, by whom, for what, and how it was processed.

Since it was deployed in 2018, CompilerWorks Lineage has become integral to the work of Lyft’s data scientists, engineers, and business users.

CompilerWorks Lineage use cases include:

  • Data Exploration
  • Data Quality
  • Pipeline Migration
  • Cost Control
  • Usage Tracking and Reporting
  • Onboarding New Data Analysts, Data Engineers, and Scientists

CompilerWorks Lineage and Lyft Amundsen combined enable users to: 

  • Deliver data lineage transparency and literacy
  • Enable cost-effective, confident data migrations
  • Reduce risk posed by corrupt or inaccurate data resources
  • Optimize compute resource utilization, saving millions
  • Improve workflow productivity at every level
  • Ensure data accuracy

To learn more about how Lyft is using CompilerWorks Lineage to increase data transparency, accuracy, cost efficiency, and productivity, read the full customer success story here.

MLB’s fan data team hits it out of the park with Teradata to BigQuery modernization

Read a comprehensive “blow by blow” description of Major League Baseball’s platform modernization project to migrate their EDW from Teradata to Google Cloud’s BigQuery.

A major step in the process was migrating ETL scripts from Teradata SQL to BigQuery SQL. Quoting Rob Goretsky, VP of Data Engineering at MLB: “The SQL transpiler from CompilerWorks was helpful, as it dealt with the rote translation of SQL from one dialect to another.”
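
To get a feel for what “rote translation” between SQL dialects involves, here is a deliberately naive, regex-based sketch in Python. It illustrates the kind of mechanical rewriting a migration requires; CompilerWorks’ actual transpiler is compiler-based rather than pattern-based, and the two rules below are just well-known examples of Teradata/BigQuery differences:

```python
import re

# Naive, regex-based sketch of "rote" dialect translation. Two well-known
# differences: Teradata accepts SEL as an abbreviation for SELECT, and
# Teradata's ADD_MONTHS(d, n) becomes BigQuery's DATE_ADD(d, INTERVAL n MONTH).
# A real transpiler parses and compiles the SQL instead of pattern-matching it.
RULES = [
    (re.compile(r"\bSEL\b", re.IGNORECASE), "SELECT"),
    (re.compile(r"ADD_MONTHS\(\s*(\w+)\s*,\s*(\d+)\s*\)", re.IGNORECASE),
     r"DATE_ADD(\1, INTERVAL \2 MONTH)"),
]

def naive_transpile(teradata_sql: str) -> str:
    for pattern, replacement in RULES:
        teradata_sql = pattern.sub(replacement, teradata_sql)
    return teradata_sql

print(naive_transpile("SEL order_id, ADD_MONTHS(order_date, 3) FROM orders"))
# SELECT order_id, DATE_ADD(order_date, INTERVAL 3 MONTH) FROM orders
```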

Read the full article by Rob here.

Migrating Teradata to BigQuery – Out with the old, in with the new

You are ready to migrate your data from Teradata to BigQuery as quickly and efficiently as possible. 

You’re looking for a solution that doesn’t require you to manually migrate code, risking human error that can slow down your migration. 

In this guide, we discuss:

  • Why you shouldn’t migrate manually
  • How CompilerWorks offers simple solutions to your migration needs

Keep reading to learn more about data migration from Teradata to BigQuery and what your business can do to speed up the process.

Table of Contents

  • Teradata To BigQuery Migration – The Manual Way 
  • Potential Problems With Manual Migration
  • What is CompilerWorks?
  • CompilerWorks Platform Migration Benefits
  • CompilerWorks’ Best Migration Practices
  • Simplify Your Teradata to BigQuery Platform Migration With CompilerWorks

Teradata To BigQuery Migration – The Manual Way 

According to Gartner, 83% of all data migrations fail to meet an organization’s expectations or fail completely. This is usually because the organization starts a code migration without understanding the fundamentals that make up a code migration.

Manual code migration is the most common migration method, but it is also the most time-consuming and the most error-prone.

Converting Teradata code involves reading the code, understanding what it is doing, and manually converting it.  

Migrations can be complex and can become multi-year projects. 

If you are going to migrate from Teradata to BigQuery manually, there are a number of steps to take. Google refers to this as its migration framework, which involves:

  • Preparation
  • Planning
  • Migration
  • Verification and Validation

Preparation

Prepare for your migration by conducting an analysis and asking questions like:

  • What are your use cases for BigQuery?
  • What databases are being migrated, and what can be migrated with little effort?
  • Which users and applications have access to these databases?
  • How is the data being used?

Planning

Start planning your migration by:

  • Assessing the current state
  • Creating a backlog
  • Prioritizing use cases
  • Defining your measures of success
  • Defining what “done” means
  • Designing a proof of concept (POC)
  • Estimating the time and costs of migration

Migration

It’s important to keep in mind that BigQuery and Teradata have different data types, so conversions may be needed.
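
For example, here are a few commonly cited Teradata-to-BigQuery type mappings, drawn from Google’s public migration guidance, expressed as a small lookup table. Treat this as a hedged sketch and verify each mapping against your own schemas:

```python
# Illustrative subset of Teradata -> BigQuery type mappings, based on
# Google's published migration guidance. Verify against your own schemas.
TERADATA_TO_BIGQUERY = {
    "BYTEINT":   "INT64",
    "SMALLINT":  "INT64",
    "INTEGER":   "INT64",
    "BIGINT":    "INT64",
    "DECIMAL":   "NUMERIC",    # very high precision may need BIGNUMERIC
    "FLOAT":     "FLOAT64",
    "CHAR":      "STRING",
    "VARCHAR":   "STRING",     # BigQuery STRING has no length constraint
    "DATE":      "DATE",
    "TIMESTAMP": "TIMESTAMP",
}

def convert_column(name: str, teradata_type: str) -> str:
    """Map one Teradata column declaration to a BigQuery one."""
    base = teradata_type.split("(")[0].strip().upper()  # drop (length/precision)
    return f"{name} {TERADATA_TO_BIGQUERY.get(base, 'STRING /* review */')}"

print(convert_column("order_total", "DECIMAL(18,2)"))  # order_total NUMERIC
```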

Manually converting code is a tedious and difficult process that leaves a lot of room for human errors.

Then you’ll perform either an offload migration (moving a use case’s schema and data without migrating its upstream pipelines) or a full migration (migrating the upstream pipelines as well).

Verification and Validation

After converting the data, you have to test all the code to make sure everything is working properly. Teradata migrations can involve testing millions of lines of code to ensure that everything runs correctly.
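
To get a sense of what this validation involves, here is a minimal hand-rolled sketch, not the kind of generated test suite a tool provides: run identical aggregates on both platforms and diff the results. The run_on_teradata and run_on_bigquery callables are hypothetical placeholders for your own database clients:

```python
# Minimal hand-rolled validation sketch: run identical aggregates on both
# platforms and diff the results. run_on_teradata / run_on_bigquery are
# hypothetical placeholders; wire in your own database clients.
CHECKS = {
    "row_count": "SELECT COUNT(*) FROM sales.orders",
    "sum_total": "SELECT SUM(order_total) FROM sales.orders",
    "max_date":  "SELECT MAX(order_date) FROM sales.orders",
}

def validate(run_on_teradata, run_on_bigquery):
    mismatches = []
    for name, sql in CHECKS.items():
        source, target = run_on_teradata(sql), run_on_bigquery(sql)
        if source != target:
            mismatches.append(f"{name}: source={source!r} target={target!r}")
    return mismatches  # an empty list means the checks agree
```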

Potential Problems With Manual Migration

Manual migration isn’t an easy task — a number of problems can arise. 

Teradata is one of the most complex systems on the market. Substantial amounts of code must be read, understood, and rewritten to work around syntax differences between the platforms during a manual migration.

During this process, human error is inevitable. Code errors can delay your migration project for weeks or even months.

Instead, CompilerWorks eliminates human error by relying on smart technology to provide the same accurate results every time.

What is CompilerWorks?

CompilerWorks has developed a powerful solution that accelerates migration to the cloud. This solution covers: 

  • Structuring of the migration project
  • Automatic and accurate SQL code migration
  • Automated testing and verification

This technological solution involves two core applications: 

  1. The Transpiler Solution: This aids in the migration of SQL code between platforms.
  2. The Lineage Solution: This provides detailed insights into how data is used across an enterprise, including by whom, for what, and at what cost.


CompilerWorks’ Core Technology

CompilerWorks’ core technology ingests source code and converts it into an algebraic representation that mathematically captures what the ingested code does.

Traditional compilers only work when given the complete code and full description of the execution environment. However, it’s impossible to meet these requirements in the realm of data processing code. 

In order to overcome this obstacle, CompilerWorks’ software makes the same intelligent inferences that a human would and then reports these deductions to the user.

Additionally, CompilerWorks’ compilers can emit code in a high-level language (the Transpiler solution) or into the lineage fabric (the Lineage solution), which represents all actions of an entire code base.
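
To make “algebraic representation” concrete, a query can be modeled as a tree of relational-algebra operators that captures what the query does regardless of dialect. The sketch below is our toy illustration, not CompilerWorks’ internal model:

```python
from dataclasses import dataclass

# Toy relational-algebra tree: one way to represent what a query *does*,
# independent of any dialect's surface syntax. Illustration only, not
# CompilerWorks' internal model.
@dataclass
class Table:
    name: str

@dataclass
class Filter:
    predicate: str
    child: object

@dataclass
class Project:
    columns: list
    child: object

# SELECT id, total FROM orders WHERE total > 100
plan = Project(["id", "total"], Filter("total > 100", Table("orders")))
print(plan)

# The same tree can be re-emitted as any target dialect's SQL (transpilation)
# or walked to record which inputs feed which outputs (lineage).
```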

CompilerWorks’ Supporting Infrastructure

In the real world, code rarely exists as simple .sql files. 

Database code is typically wrapped in scripts, BI reports, and ETL tools. CompilerWorks provides the tools to extract the SQL code from these various wrappers, then transpiles and re-wraps it so that it is ready for execution and testing immediately.

Hundreds of transformers are embedded in the Transpiler solution, including platform-specific optimization transformers.

The lineage fabric takes advantage of the wealth of information captured by delivering global static analysis of data processing activities and providing GUI, CLI, GraphQL, and API interfaces. Together, the CompilerWorks core technology and supporting infrastructure make up the Transpiler Solution, which delivers fast, accurate, and predictable migration between data processing platforms.

CompilerWorks’ Platform Migration Benefits

Manual code migration is one giant mess waiting to happen. Human error is almost inescapable. 

CompilerWorks software scopes out the entire project at the beginning of the migration process by automatically creating a comprehensive data lineage of the source systems. This makes it possible to automatically identify gaps in the source code and avoid project delays that can last for months.

This automated process using the CompilerWorks Transpiler has three key benefits: 

  1. Accuracy
  2. Predictability
  3. Speed

Accuracy 

With manual migration, a series of rules are followed to rewrite a query. To ensure the query will run on the target platform, an execution test is performed. This traditional approach is prone to error. 

To be crystal clear: manually rewritten code can contain errors that go undetected by basic testing strategies.

Instead of this approach, the Transpiler is designed to produce the same answer on both the source and target systems. 

Unlike human-driven conversions that can provide unpredictable results, the Transpiler provides accuracy by giving you the same correct answer, every time.

Predictability

With the CompilerWorks’ Transpiler, you can expect a predictable end-to-end solution for managing and executing platform migration projects.

In a code migration project, the code must be:

  • Located
  • Extracted
  • Converted (applying code transformations)
  • Tested and validated

By processing the execution logs from the source system, the Transpiler systematically and immediately identifies:

  • Code that is missing from the source provided for the migration project
  • Functionalities on the source system that need to be replicated on the target system
  • Any gaps in functionality in the target system that will need human intervention to migrate

The result? 

  • No more surprises in the migration project. 
  • No re-scoping because a new functionality/code is found.
  • No delays caused by missing functionality in the target system that was discovered halfway through the migration project.

Beyond the predictability created by transpiling all of the code in the planning stage of the migration project, the lineage model provides a roadmap for structuring the migration project.

CompilerWorks offers the ability to strategically plan where you want to start your migration project and then provides guidance to order the migration in the most efficient and expeditious way possible. 

Speed

The Transpiler delivers performant and accurate code at lightning speeds. 

CompilerWorks can reduce the time spent on a migration project by 50% or more.

This is because the compiler has an understanding of all the nuances of the code being converted and the capabilities of the platform that it is generating code for. This information is used to generate performant code for the target platform.

As a testament to the Transpiler’s speed, CompilerWorks’ largest customer compiles 10TB of SQL on a single machine, on a daily basis. 


CompilerWorks’ Best Migration Practices

CompilerWorks’ Transpiler solution offers four key migration best practices: 

  • Structured migration
  • Iterative process
  • Integrated testing
  • Security review

Structured Migration

The CompilerWorks lineage fabric guides the entire migration project. 

Instead of manually reviewing the code to try to understand discrepancies between queries, relations, and attributes, CompilerWorks automates the process and provides a rich user interface to plan the migration project. 

If you are working on a “lift, improve, and shift” migration, the lineage model will immediately show you where you can wipe out unused processing and data, while also directing you to modifications in the data processing landscape that make the most logical sense.

If you are working on a “redesign, re-architect, and consolidate” migration, the lineage model will provide the information (from across multiple source systems) to drive the entire migration project, which is made possible by the Transpiler itself.

An ideal approach to a “lift and shift” migration involves these eight steps (sketched in code after the list):

  1. Select a key management report that you wish to migrate.
  2. Discover all immediate upstream requirements by reviewing the lineage.
  3. Transpile the upstream table DDL for the target system.
  4. Execute the translated DDL on the target system.
  5. Copy the required data.
  6. Execute the transpiled DML.
  7. Execute the provided verification queries.
  8. Use the lineage model to guide the next level of migration (loop back to step 1).
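
For concreteness, here is a minimal sketch of that eight-step loop as code. Every callable is a placeholder for a manual or tool-driven action, not a real CompilerWorks API:

```python
# Hedged sketch of the eight-step loop above. Every callable is a
# placeholder for a manual or tool-driven action, not a real API.
def migrate_one_report(report, upstream_of, transpile_ddl, transpile_dml,
                       execute_on_target, copy_data, verify):
    for table in upstream_of(report):             # step 2: review the lineage
        execute_on_target(transpile_ddl(table))   # steps 3-4: transpile, run DDL
        copy_data(table)                          # step 5: copy the data
    for statement in transpile_dml(report):
        execute_on_target(statement)              # step 6: run transpiled DML
    verify(report)                                # step 7: verification queries
    # Step 8: pick the next report from the lineage model and call this again.
```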

Iterative Process

To deliver a complete migration solution, CompilerWorks leverages the core capabilities of the transpiler.

This solution enables the testing of multiple migration strategies and selects the best approach for the particular migration project involved.

The iterative process works in five steps: 

  1. Assemble all inputs.
  2. Configure the transpiler as desired.
  3. Execute the transpiler.
  4. Inspect the outputs.
  • If missing inputs are discovered, loop back to step 1.
  • If the transpiler configuration needs tuning, loop back to step 2.
  5. Copy the required data.

This fast cycle enables experimentation, so you can compare and test the code to best meet your requirements.
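
A minimal code sketch of the loop, with each callable standing in for a manual or tool-driven action (not a real API):

```python
# Minimal sketch of the five-step iterative loop; each callable stands in
# for a manual or tool-driven action, not a real API.
def iterate_until_clean(assemble, configure, execute, inspect, copy_data):
    inputs, config = assemble(), configure()      # steps 1-2
    while True:
        outputs = execute(inputs, config)         # step 3
        missing, needs_tuning = inspect(outputs)  # step 4
        if missing:
            inputs = assemble()                   # loop back to step 1
        elif needs_tuning:
            config = configure()                  # loop back to step 2
        else:
            break
    copy_data()                                   # step 5
```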

Integrated Testing

In integrated testing, the transpiler generates a comprehensive suite of test queries to validate DML and DDL migration.

Integrated testing works in four steps: 

  1. Create the table on the target system.
  2. Compare SourceReadDQ to TargetReadDQ.
  3. Execute the pipeline on the source and target systems.
  4. Compare SourceWriteDQ to TargetWriteDQ.

To facilitate automation of test query execution on both the source and target systems, the test queries are compiled in a machine-readable file. Correct migration is confirmed by the verified execution of the test query suite. 
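
To make steps 2 and 4 concrete, here is a minimal comparison sketch. The SourceReadDQ and TargetReadDQ names come from the steps above; representing each bundle as a dict of query results is our assumption for illustration:

```python
# Compare two bundles of data-quality query results, keyed by query name.
# SourceReadDQ/TargetReadDQ are the names from the steps above; the
# dict-of-results shape is our assumption for illustration.
def compare_dq(source_dq: dict, target_dq: dict) -> bool:
    ok = True
    for query_name, source_result in source_dq.items():
        target_result = target_dq.get(query_name)
        if source_result != target_result:
            print(f"MISMATCH {query_name}: {source_result!r} != {target_result!r}")
            ok = False
    return ok

assert compare_dq({"row_count": 42}, {"row_count": 42})
```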

Security Review

With CompilerWorks, security reviews are a breeze. All of CompilerWorks’ software is designed with security as a top priority:

  • CompilerWorks never touches data. It only processes code.
  • CompilerWorks is a standalone package that can run on an air-gapped machine.
  • CompilerWorks generates clean logs: values are obfuscated.
  • CompilerWorks has frequent updates.

CompilerWorks leaves zero footprint.

Simplify Your Teradata to BigQuery Platform Migration With CompilerWorks

The CompilerWorks Transpiler Solution is the logical choice for simplifying and ensuring the success of your platform migration. Turn your large, high-risk, slow, manual migration from Teradata to BigQuery into a predictable, fast, accurate, and painless automated process with CompilerWorks.

CCPA vs GDPR: A Comparison Guide

You’ve been searching for ways to remain in compliance with the GDPR and the CCPA.

Understanding the difference between CCPA and GDPR can be complex. There is so much information to sift through and it’s overwhelming. 

How do you know if your business is compliant with one, both, or neither?

For businesses operating in California, it’s important to understand both and what they mean for your business. A simple “notice and choice” option for consumers is not enough to give consumers rights over their information.  

There are ways to check whether your company needs to comply with GDPR, CCPA, or both, and what this looks like. 

In this guide, we are going to explain how to tell which regulation applies to your business, how to learn if your business is compliant, and how we can help you achieve it.

Table of Contents

  • What are GDPR and CCPA?
  • GDPR
  • CCPA
  • Key Differences Between GDPR and CCPA
  • What CCPA and GDPR Compliance Guidelines Mean For Your Business
  • How CompilerWorks Can Help 
  • Enabling GDPR and CCPA Compliance With CompilerWorks


What are GDPR and CCPA?

The GDPR and CCPA are essential data privacy laws that affect businesses around the world. They both protect consumers’ privacy. 

This is great news for consumers. 

The bad news is, compliance with the General Data Protection Regulation (GDPR) does not guarantee compliance with the California Consumer Privacy Act (CCPA).

We’re going to briefly explain each and how this may affect you.

GDPR:

On May 25th, 2018, one of the toughest privacy and security laws in the world went into effect in the European Union: the General Data Protection Regulation (GDPR). This law applies to anyone who targets or collects data related to people in the EU.

Let’s say that your enterprise tracks EU visitors to your website. You see that their IP address falls in EU territory.

You might want to know:

  • Their browsing activity
  • The kind of computer they’re using
  • Other accounts they’ve logged into
  • Information from tracking cookies, etc.

Collecting that kind of data places your enterprise under the legal scope of the GDPR. Many major U.S.-based companies are affected by this.

Here’s how it’s supposed to work.

Rights transparency is central to the GDPR. 

The GDPR requires companies to inform consumers about the types of data being collected about them, and why. Consumers had to agree to many updated terms of service by that May 25th deadline.

If they didn’t, they could no longer use that site. 

If a business doesn’t comply, the penalties can be steep: up to 4% of the company’s global annual revenue or 20 million euros, whichever is higher.

There are some exceptions to the rule.

If you’re collecting email addresses and contact information to organize a birthday party, the GDPR will not apply to you. It applies solely to professional or commercial activity.

There are some limits to these exceptions. 

CCPA:  

In 2016, former California Attorney General Kamala Harris released a report detailing data breaches that affected about 49 million California residents.

This shined a spotlight on the need for greater security on the web. And with an economy bigger than the UK’s, California needed its own solution.

The 2020s have become the decade where the U.S. really gets serious about data security. The California Consumer Privacy Act (CCPA) came into effect on January 1st, 2020. Enforcement began July 1st, 2020.

The CCPA gives consumers more control over the personal information that businesses collect about them.

In preparation, you might have begun to get your house in order long ago. So who does it apply to?

It applies to any company that operates in California and meets at least one of these requirements:

  • It makes at least $25 million in annual revenue
  • Its primary business is the sale of personal information

Here’s a simple list of CCPA consumer rights. Consumers have the right to:

  • Know how their personal data is processed
  • Opt out of the sale of personal information
  • Request the deletion of personal information
  • Non-discrimination for exercising these consumer rights
  • A private right of action for certain data breaches
  • For minors: the right to opt in to the sale of their personal information

What happens if you violate the CCPA? 

California Attorney General Xavier Becerra told Reuters in 2019: “If they are not (operating properly) … I will descend on them and make an example of them, to show that if you don’t do it the right way, this is what is going to happen to you.”

The hammer will come down.

Key Differences Between GDPR and CCPA:

The GDPR and CCPA often use different definitions, scopes, and exceptions to their regulations. For example, the CCPA defines “personal data” more broadly and includes data about devices. The GDPR focuses on specific individuals and is less process-oriented than the CCPA. 

The CCPA requires a different scope of privacy disclosures than the GDPR. 

According to the GDPR, “personal data” is defined broadly to mean “any information relating to an identified or identifiable person”. This includes things like:

  • Cookies
  • IP addresses
  • Device IDs, etc.

Under the CCPA, “personal data” extends to data associated with a household.

Adhering to the GDPR may not make your company compliant with the CCPA. Keep reading to look at some primary differences between the CCPA and GDPR.

Data Collection Practices

The CCPA and the GDPR are constantly changing and adapting to new technologies. As a result, the specific measures businesses need to take are unfortunately vague.

As you collect personal data, the GDPR requires consumer rights disclosure that covers such things as:

  • Transparency
  • Purpose limitation
  • Collecting only the minimal amount of data necessary
  • Collecting accurate data
  • Encrypting or pseudonymizing data where possible

The CCPA has more specific consumer rights regarding the collection and sale of their data:

  • Consumers must be informed of what categories of information are being collected. This can include IP addresses, internet activity, geolocation data, education information, and more. 
  • Consumers must be notified about why this information is being collected. How will this information be used?
  • The right to request the deletion of this information must be disclosed, as well as the limitations to these rights. 
  • Are there any additional categories of data that are being collected? Any additional purposes this data can be used for? The consumer must also be notified of this. 

Additional disclosures need to be made if the information is being sold or disclosed for business purposes.  

Enforcement and Nondiscrimination Practices 

In order to compare GDPR and CCPA, it’s important to look at how infractions are assessed.

The GDPR bases fines on global revenue; they can reach 2% to 4% depending on the nature of the infraction. This can mean huge numbers, especially for some well-known companies in Silicon Valley.

The CCPA looks at how many consumers are affected. For civil penalties, the California Attorney General may impose $2,500 per violation. Intent matters: the fine can be up to $7,500 if the violation is intentional.

You can take a deep breath: there is a 30-day cure period for violations, given notice.

The CCPA and GDPR provide consumers with a “right to non-discrimination”. Under both, a business must not use collected information to discriminate against a consumer.

What CCPA and GDPR Compliance Guidelines Mean For Your Business

In order to protect your business, it’s beneficial to compare CCPA vs GDPR compliance and find out what consumer rights apply.  

What does this mean for businesses in California?  

Non-compliance comes with some steep costs. Happily, neither law mandates specific encryption strengths or technologies for compliance.

GDPR Compliance 

For data collection compliance, the GDPR sets six criteria that must be met:

  1. The data must be collected lawfully, fairly, and in a transparent manner.
  2. It must be collected for a legitimate reason and with limited purposes.
  3. It must be adequate, limited to what is necessary and relevant.
  4. Data must be accurate and kept up to date where necessary.
  5. The data must be kept in an identifiable form no longer than necessary.
  6. Data must be processed securely. 

There are checklists upon checklists for remaining compliant with every section of the GDPR. Combing through data acquired from consumers, and manually tracing the purpose of each line of processing code, is a tedious process.

CCPA Compliance 

Here are some key suggestions in order to align with CCPA regulations:

  • You’ll need to know where your data is. You can use a cloud environment or data warehouse to manage this.
  • Encrypt or redact your data. 
  • If you’re selling personal information, be sure to track, and respond to, opt-in and opt-out requests.    
  • Offer two ways for a consumer to opt out of the sale of their data.

Another key way to remain in compliance is to have a robust data inventory. You need to know why you have that data and who should have access to it. This requires data mapping: the process of creating data element mappings between two distinct data models (a hypothetical example appears after the list below).

Data mapping can help you with processes like:

  • Data migration: the process of moving data from one application to another.
  • Data integration: the process of combining data from different sources into a single, unified view.
  • Data transformation: the process of converting data from one data structure to another.
  • Data warehousing: the process of constructing and utilizing a data warehouse.
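
As a concrete, hypothetical example, a data element mapping between two models can be as simple as a source-to-target field table; every name below is invented:

```python
# Hypothetical data element mapping between two data models: each source
# field maps to a target field plus a transformation rule.
FIELD_MAP = {
    # source model field   (target model field,    transformation)
    "cust.email_addr": ("customer.email",      "lowercase"),
    "cust.dob":        ("customer.birth_date", "parse MM/DD/YYYY as DATE"),
    "cust.state_cd":   ("customer.region",     "lookup in state_region table"),
}

for source_field, (target_field, rule) in FIELD_MAP.items():
    print(f"{source_field:16} -> {target_field:21} ({rule})")
```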

How CompilerWorks Can Help  

An ideal compliance solution must empower a data protection officer to: 

  • Have the ability to identify PII wherever it is in the organization’s data infrastructure
  • Highlight wherever PII is used for analysis
  • Have the ability to enable the destruction of PII for any selected individual across the entire organization

These requirements are not restricted to individual departments or particular data processing repositories; they span cross-functional areas across the entire organization.

To solve challenges imposed by compliance, a DPO must be enabled to: 

  • Track processing and data movement across organizational and technological boundaries
  • Audit data processing and access
  • Complete comprehensive analyses of data flow

CompilerWorks offers an ideal solution to these compliance challenges through its lineage fabric and the CompilerWorks Lineage solution built around it.

Enabling GDPR and CCPA Compliance With CompilerWorks

The lineage fabric developed by CompilerWorks is generated with a standard process regardless of the application area. 

PII can be identified anywhere in an organization’s data storage and processing infrastructure. This allows the DPO to identify PII directly, and allows others across the organization to tag PII.

The lineage model automatically tracks the propagation of PII across the data infrastructure. The DPO can then track PII enterprise-wide, consistent with the enterprise’s particular policies.

By integrating the identification of PII with the lineage model, automated analyses can be enabled, such as the following (a toy sketch of the first analysis appears after the list):

  • Tracking PII data movement at the column and row level:
    • Data copying
    • Aggregation
    • PII “leakage”
  • Auditing data access, which allows specific, time-stamped identification of which users and systems view each piece of PII.
  • Destroying PII from the data source throughout the entire data infrastructure.
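
As a toy illustration of the first analysis: tag one column as PII, then walk a column-level lineage graph to flag every downstream copy. The graph shape and names below are invented, not CompilerWorks’ internal model:

```python
# Toy sketch of PII propagation over a column-level lineage graph. Edges
# map a column to the columns derived from it; tagging one column as PII
# flags every downstream copy. Names and graph shape are illustrative.
LINEAGE_EDGES = {
    "raw.users.email": ["staging.users.email"],
    "staging.users.email": ["marts.contacts.email", "exports.crm_feed.email"],
}

def downstream_pii(tagged_column: str) -> set:
    reached, frontier = set(), [tagged_column]
    while frontier:
        column = frontier.pop()
        for derived in LINEAGE_EDGES.get(column, []):
            if derived not in reached:
                reached.add(derived)
                frontier.append(derived)
    return reached

print(downstream_pii("raw.users.email"))
# {'staging.users.email', 'marts.contacts.email', 'exports.crm_feed.email'}
# (set order may vary)
```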

This allows the DPO not only to demonstrate compliance to management and the authorities, but also to control data access and usage across the entire organization and its processing infrastructure.

With the CompilerWorks lineage model, compliance is simplified.

Redshift to BigQuery Migration with CompilerWorks

You are considering migrating from Redshift to BigQuery and want your code migration to go as smoothly as possible.  

CompilerWorks can help make your Redshift to BigQuery migration fast, predictable, and accurate.

In this guide, we will talk about Redshift to BigQuery migration challenges and how CompilerWorks can help. 

Table of Contents

  • Migrate Redshift to BigQuery – The Manual Way
  • Redshift to BigQuery Migration Challenges
  • What is CompilerWorks?
  • CompilerWorks’ Platform Migration Benefits
  • CompilerWorks’ Best Migration Practices
  • Simplify Your Migration From Redshift to BigQuery with CompilerWorks

Migrate Redshift To BigQuery – The Manual Way 

An analysis report conducted by Gartner revealed that 83% of platform migrations either fail to meet an organization’s expectations or fail completely. Oftentimes, this failure occurs because an organization does not fully understand the fundamentals of manual migration. Additionally, manual migration can be a lengthy, error-prone process due to inescapable human error. This can lead to delays and multi-year projects.

Manually migrating code involves: 

  • Understanding the landscape of the code
  • Manually converting the code
  • Testing the code on copied data

In order to manually migrate from Redshift to BigQuery, there are a number of steps that you must take:

  • Preparation
  • Planning
  • Migration
  • Verification and Validation

Preparation

In order to prepare for your manual migration, it is imperative to conduct an analysis by asking yourself questions like:

  • Why are you migrating to BigQuery?
  • Which databases are going to be migrated? What can be migrated with the least amount of effort?
  • Who and what have access to these databases?
  • How will the data be used?

Planning

Begin planning your manual migration by:

  • Assessing the current state
  • Creating a backlog
  • Prioritizing use cases
  • Defining your measures of success
  • Defining what “done” means
  • Designing a proof of concept (POC)
  • Estimating the time and costs of migration

Migration

Redshift and BigQuery have different data types, so conversions may be required. Manually converting code is a complex, tedious process that is prone to human error, and delays are almost to be expected. Once the code is successfully converted, you’ll perform an offload migration or a full migration.
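
For instance, here are a few commonly cited Redshift-to-BigQuery type mappings, drawn from public migration guidance; treat this as a hedged sketch and verify each mapping against your own schemas:

```python
# Illustrative subset of Redshift -> BigQuery type mappings, drawn from
# public migration guidance. Verify each against your own schemas.
REDSHIFT_TO_BIGQUERY = {
    "SMALLINT":         "INT64",
    "INTEGER":          "INT64",
    "BIGINT":           "INT64",
    "DECIMAL":          "NUMERIC",
    "REAL":             "FLOAT64",
    "DOUBLE PRECISION": "FLOAT64",
    "BOOLEAN":          "BOOL",
    "CHAR":             "STRING",
    "VARCHAR":          "STRING",
    "TIMESTAMP":        "TIMESTAMP",
}

print(REDSHIFT_TO_BIGQUERY["DOUBLE PRECISION"])  # FLOAT64
```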

Verification and Validation

After the data is converted, it is imperative to test all of the code to make sure there are no mistakes. This process can be lengthy, as millions of lines of code may need to be tested, but it is essential to find and fix errors as soon as possible to avoid longer project delays. Using a manual process inevitably leads to problems that are discovered only after the switch to BigQuery. This extended issue discovery, and the fixes it requires, is rarely considered when deciding to migrate manually.

Redshift to BigQuery Migration Challenges

Manual migration is a complex task that can lead to frustrating challenges and delays.

A large amount of code needs to be added, read, and understood when completing a manual migration. 

Even with a full understanding of the manual migration process, human error can be an unavoidable challenge in this process, leading to weeks, even months of delays. 

Humans get tired, machines don’t. 

CompilerWorks eliminates migration challenges caused by human errors by relying on smart technology to provide the same, accurate results every time.

What is CompilerWorks?

CompilerWorks can help accelerate your migration to the cloud with a robust solution that covers:

  • Structuring the migration project
  • Automatic and accurate SQL code migration
  • Automated testing and verification

This solution is comprised of two core applications: 

  1. The Transpiler Solution, which aids in the migration of SQL code between platforms.
  2. The Lineage Solution, which provides detailed insights into how data is used across an enterprise: by whom, for what, and at what cost.

CompilerWorks’ Core Technology

CompilerWorks’ core technology works by ingesting source code and converting it into an algebraic representation that mathematically captures the actions of the ingested code.

Traditionally, compilers only work when they are given the complete code and full description of the execution environment. However, in the realm of data processing code, this is a nearly impossible feat.

CompilerWorks’ software is able to overcome this obstacle with its ability to make the same intelligent inferences as a human and report deductions to the user.

CompilerWorks’ compilers can also emit code either as a high-level language (the Transpiler solution) or into the lineage fabric (the Lineage solution).

CompilerWorks’ Supporting Infrastructure

Code rarely exists as simple .sql files.

Typically, database code is wrapped in: 

  • Scripts
  • BI reports
  • ETL tools

With CompilerWorks, all of the necessary tools to extract the SQL code from these various wrappers are provided. The code is transpiled and re-wrapped for immediate execution and testing.

Hundreds of transformers are embedded in the Transpiler solution, including platform-specific optimization transformers.

The lineage fabric makes use of information captured by:

  • Delivering global static analysis of data processing activities
  • Providing GUI, CLI, GraphQL, and API interfaces

The CompilerWorks core technology and infrastructure work seamlessly together to form the Transpiler Solution, resulting in fast, accurate, and predictable platform migration.

CompilerWorks’ Platform Migration Benefits

CompilerWorks’ Transpiler Solution is designed for accurate and automated SQL code migration between platforms. 

At the beginning of a migration project, CompilerWorks software scopes the entire project by automatically creating a comprehensive data lineage of the source systems. Any gaps in the source code are identified automatically, avoiding project delays that can last for months.

The three key benefits of using the CompilerWorks Transpiler to automate this process are: 

  • Accuracy
  • Predictability
  • Speed

Accuracy

The traditional approach to code migration typically requires following a set of rules to rewrite a query and performing an execution test to ensure the query will run on the target platform. This approach is highly prone to error, as mistakes can slip through basic testing strategies. 

CompilerWorks Transpiler is designed to automatically produce the same, accurate answer on both the source and target system every time.

Predictability

When it comes to managing and executing code migration projects, the CompilerWorks’ Transpiler offers a predictable, end-to-end solution.

CompilerWorks processes execution logs from the source system to immediately identify:    

  • Code that is missing in the source system
  • All functionality on the source system that needs to be replicated on the target system
  • Any gaps in functionality in the target system that will require human intervention to migrate

The result is: 

  • No more surprises during migration projects.
  • No re-scoping because a new code or functionality is discovered.
  • No more delays due to missing functionality in the target system that was only discovered halfway into the migration project.
  • No surprises after the switch to BigQuery is complete.

Speed

Transpiler can generate performant and accurate code at rapid speed. 

This is because the compiler understands all the nuances of the code being converted and the capabilities of the platform it is generating code for. With this information, it generates performant code for the target platform.

The CompilerWorks’ Transpiler solution can reduce time spent on code migration projects by 50% or more.

CompilerWorks’ Best Migration Practices

The CompilerWorks Transpiler Solution involves four key migration best practices:

  • Structured migration
  • Iterative process
  • Integrated testing
  • Security review

Structured Migration

The entire migration project is guided by the CompilerWorks lineage fabric.

CompilerWorks eliminates the need to manually review code. Instead, CompilerWorks automates this process and provides a rich user interface for planning the migration.

If you are working on a “lift, improve, and shift” migration, the lineage model can immediately show you where you can wipe out unused processing and data, while also directing you to the modifications in the data processing landscape that make the most logical sense.

If you are working on a “redesign, re-architect, and consolidate” migration, the lineage model will provide the necessary information, pulled from multiple source systems, to drive the entirety of the migration project.

The ideal approach to a “lift and shift” migration involves following these eight steps:

  1. Select a key management report that you want to migrate.
  2. Discover all immediate upstream requirements by reviewing the lineage.
  3. Transpile the upstream table DDL for the target system.
  4. Execute the translated DDL on the target system.
  5. Copy the required data.
  6. Execute the transpiled DML.
  7. Execute the provided verification queries.
  8. Use the lineage model to guide the next level of migration (loop back to step 1).

Iterative Process

CompilerWorks leverages the core capabilities of the transpiler to deliver a complete migration solution by enabling the testing of multiple migration strategies and selecting the best approach for the particular migration project involved.

The iterative process workflow is completed in five steps: 

  1. Assemble all inputs.
  2. Configure the transpiler.
  3. Execute the transpiler.
  4. Inspect the outputs.
  • Loop back to step 1 if missing inputs are found.
  • Loop back to step 2 if the transpiler configuration needs tuning.
  5. Copy the required data.

Because this cycle is a rapid process, you can experiment with comparing and testing the code in order to best meet requirements.

Integrated Testing

The transpiler generates a thorough suite of test queries to validate DML and DDL migration in integrated testing.

Integrated testing is completed in four steps: 

  1. Produce the table on the target system.
  2. Compare SourceReadDQ to TargetReadDQ.
  3. Execute the pipeline on the source and target systems.
  4. Compare SourceWriteDQ to TargetWriteDQ.

The test queries are assembled in a machine-readable file in order to facilitate the automation of test query execution on both the source and target systems. The verified execution of the test query suite will confirm that the migration is completed correctly. 

Security Review

Security reviews are easy with CompilerWorks. 

CompilerWorks’ software was designed with security as a top priority. You can have peace of mind that:

  • CompilerWorks only processes code and never touches data. 
  • CompilerWorks is a standalone package that can run on an air-gapped machine.
  • CompilerWorks creates clean logs: values are obfuscated.
  • CompilerWorks frequently updates.
  • CompilerWorks leaves zero footprint.

Simplify Your Migration From Redshift to BigQuery With CompilerWorks

The CompilerWorks Transpiler solution can simplify and eliminate delays in your migration from Redshift to BigQuery.

Say goodbye to traditionally large, high-risk, slow, and error-prone migration projects and step into the future with CompilerWorks’ Transpiler solution for a predictable, fast, and accurate migration.

SQL Server to Snowflake Migration with CompilerWorks

You want to migrate from SQL Server to Snowflake without experiencing the major challenges of manual migration.

CompilerWorks can help make your migration from SQL Server to Snowflake fast, accurate, and predictable.

In this guide, we will discuss: 

  • SQL Server to Snowflake migration challenges
  • How CompilerWorks can simplify your SQL Server to Snowflake migration

Table of Contents

  • Migrating SQL Server To Snowflake
  • Why Would You Want To Move Data from SQL Server to Snowflake?
  • Limitations and Challenges When Moving SQL Server To Snowflake
  • How CompilerWorks Can Help
  • Simplify Your Migration From SQL Server to Snowflake With CompilerWorks

SQL Server to Snowflake: Potential Issues With Manual Migration

Is There An Easier Way To Migrate Data From SQL Server To Snowflake?

Using Snowflake makes data storage easy, but making the transfer is a different story.

Most of your SQL Server to Snowflake migration options are time-consuming and can be risky. 

There is an easier way.

Using a specialized data migration service, like CompilerWorks, reduces the chance for human error and makes the transfer much faster.

That means less downtime for you and a much higher rate of accuracy.

How CompilerWorks Can Help

CompilerWorks’ Transpiler solution is a robust way to accelerate migration to the cloud.

The Transpiler solution covers: 

  • Automatic, accurate SQL code migration
  • Structuring of the migration project
  • Automated testing and verification

The CompilerWorks’ Transpiler solution works to avoid lengthy delays by:

  1. Scoping the entire project by automatically creating a comprehensive data lineage of source systems at the start of the migration project. 
  2. Automatically identifying gaps in the source code that could otherwise cause delays lasting for months.

With the CompilerWorks’ Transpiler solution, you can rely on its accuracy, predictability, and speed to simplify your SQL Server to Snowflake migration.

Accuracy 

Unlike the traditional process of manually rewriting code, which can lead to errors that slip through basic testing strategies and cause massive delays, the transpiler is designed to produce the same, accurate answer on the source and target systems every time. 

Predictability

The CompilerWorks’ Transpiler solution eliminates surprises during the migration project. 

The Transpiler guarantees: 

  • No need to rescope because new code/functionality is discovered.
  • No delays introduced due to missing functionality in the target system that wasn’t discovered until halfway through the migration project. 

This is because CompilerWorks processes the execution logs and systematically identifies:

  • Any missing code in the source system provided for the migration project
  • All functionality on the source system that needs to be replicated onto the target system
  • Any gaps in functionality found in the target system that will require human intervention for the project migration.

The CompilerWorks’ Transpiler solution is a predictable, end-to-end solution for the management and execution of migration projects.

Speed

The transpiler quickly generates performant and accurate code. 

This is due to the compiler understanding all the nuances of the code being converted and the capabilities of the platform it is being generated for. The compiler uses this information to create performant code for the target platform. 

With the CompilerWorks’ Transpiler solution, time spent on migration projects can be reduced by 50% or more.

For example, CompilerWorks’ largest customer compiles 10TB of SQL on a daily basis using a single machine. 

Simplify Your Migration From SQL Server to Snowflake With CompilerWorks

The CompilerWorks Transpiler solution takes the stress and arduous work out of your SQL Server to Snowflake migration project.

With CompilerWorks’ revolutionary core technology, the Transpiler Solution can turn your large, high-risk, slow, error-prone migration into a predictable, structured, fast, and accurate one.