PacktPublishing/The-Definitive-Guide-to-Data-Integration

The Definitive Guide to Data Integration

Organization and resources

This is the code repository for The Definitive Guide to Data Integration, published by Packt.

Unlock the power of data integration to efficiently manage, transform, and analyze data

What is this book about?

The Definitive Guide to Data Integration is an indispensable resource for navigating the complexities of modern data integration. Focusing on the latest tools, techniques, and best practices, this guide helps you master data integration and unleash the full potential of your data.

This book covers the following exciting features:

  • Discover the evolving architecture and technologies shaping data integration
  • Process large data volumes efficiently with data warehousing
  • Tackle the complexities of integrating large datasets from diverse sources
  • Harness the power of data warehousing for efficient data storage and processing
  • Design and optimize effective data integration solutions
  • Explore data governance principles and compliance requirements

If you feel this book is for you, get your copy today!

https://www.packtpub.com/

Instructions and Navigations

The code will look like the following:

# Filter employees with a salary greater than $50,000
filtered_employees_df = employees_df.filter(employees_df.salary > 50000)
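
The snippet above assumes an existing Spark DataFrame named employees_df with a salary column. For readers without a Spark session at hand, the same row-filtering logic can be sketched in plain Python. This is an illustrative sketch, not code from the book: the Employee record, the filter_by_salary helper, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    # Hypothetical record shape; only `salary` appears in the original snippet.
    name: str
    salary: int


def filter_by_salary(employees, threshold=50000):
    """Keep employees whose salary is strictly greater than `threshold`."""
    return [e for e in employees if e.salary > threshold]


# Invented sample data, for illustration only.
employees = [
    Employee("Ana", 42000),
    Employee("Ben", 50000),
    Employee("Chloe", 73000),
]

filtered_employees = filter_by_salary(employees)
print([e.name for e in filtered_employees])  # prints ['Chloe']
```

As in the DataFrame version, the comparison is strict, so an employee earning exactly 50000 is excluded.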

Following is what you need for this book: This book is perfect for data engineers, data architects, data analysts, and IT professionals looking to gain a comprehensive understanding of data integration in the modern era. Whether you’re a beginner or an experienced professional enhancing your knowledge of the modern data stack, this definitive guide will help you navigate the data integration landscape.

The following software and hardware are covered in the book (Chapters 1-16).

Software and Hardware List

Chapter   Software required                                         OS required
1-16      SQL and data transformation                               Windows, macOS, or Linux
1-16      Massively parallel processing systems                     Windows, macOS, or Linux
1-16      Spark for data transformation                             Windows, macOS, or Linux
1-16      Data storage technologies (data warehouses, data lakes,   Windows, macOS, or Linux
1-16      Data modeling techniques                                  Windows, macOS, or Linux
1-16      Data integration models (ETL and ELT)                     Windows, macOS, or Linux
1-16      Data exposition technologies (Streams, REST APIs,         Windows, macOS, or Linux

Get to Know the Authors

Pierre-Yves Bonnefoy is a versatile data and cloud architect boasting over 20 years of experience across diverse technical and functional domains. With an extensive background in software development, systems and networks, data analytics, and data science, Pierre-Yves offers a comprehensive view of information systems. As the CEO of Olexya and CTO of Africa4Data, he dedicates his effort to delivering cutting-edge solutions for clients and promoting data-driven decision-making. As an active board member of French Tech Le Mans, Pierre-Yves enthusiastically supports the local tech ecosystem, fostering entrepreneurship and innovation while sharing his expertise with the next generation of tech leaders. You can contact him at [email protected].

Emeric Chaize, with over 16 years of experience in data management and cloud technology, demonstrates a profound knowledge of data platforms and their architecture, further exemplified by his role as president of Olexya, a data architecture company. His background in computer science and engineering, combined with hands-on experience, has honed his skills in understanding complex data architectures and implementing efficient data integration solutions. His work at various small and large companies has demonstrated his proficiency in implementing cloud-based data platforms and overseeing data-driven projects, making him highly suited for roles involving data platforms and data integration challenges. You can contact him at [email protected].

Raphaël Mansuy is a seasoned technology executive and entrepreneur with over 25 years of experience in software development, data engineering, and AI-driven solutions. As a founder of several companies, he has demonstrated success in designing and implementing mission-critical solutions for global enterprises, creating innovative technologies, and fostering business growth. Raphaël is highly skilled in AI, data engineering, DevOps, and cloud-native development, offering consultancy services to Fortune 500 companies and start-ups alike. He is passionate about enabling businesses to thrive using cutting-edge technologies and insights. You can contact him at [email protected].

Mehdi TAZI is a data and cloud architect with over 12 years of experience and the CEO of an IT consulting and investment company. He specializes in distributed information systems and data architecture. He navigates through both platform and application facets. Mehdi designs information systems architectures that answer customers’ needs by setting up technical, functional, and organizational solutions, as well as designing and coding in languages such as Java, Scala, or Python. You can contact him at [email protected]/tazimehdi.com.

About

B19415 - The Definitive Guide to Data Integration
