YHordiichuk/DeequTestTask

Deequ Test Task

Data Quality Engineering - Deequ test task

Steps for setup

  1. Set up the environment. Add the JDBC driver to the folder with the jars (spark_version=3.0.0).
  2. Create a new login in MS SQL Server Management Studio (SSMS). Save the username, password, server name, and database name (DdCredintalsHint).
  3. Update the values in the project with those from the previous step (change_values_hint).
  4. Install pydeequ via pip in the Docker container.
  5. Run Deequ_pySpark_skeleton.ipynb in the Docker container.
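
As a rough illustration of how the values saved in step 2 might be used from PySpark, here is a minimal sketch. It is not taken from the notebook: the helper names `build_jdbc_options` and `read_table`, the default port 1433, and the placeholder values in the usage note are all assumptions. The URL format and driver class are those of the Microsoft JDBC driver for SQL Server.

```python
# Driver class shipped in the Microsoft JDBC driver jar added in step 1.
MSSQL_DRIVER = "com.microsoft.sqlserver.jdbc.SQLServerDriver"


def build_jdbc_options(server, database, user, password, table, port=1433):
    """Return the option dict for Spark's JDBC reader (hypothetical helper).

    Port 1433 is SQL Server's usual default and is an assumption here.
    """
    return {
        "url": f"jdbc:sqlserver://{server}:{port};databaseName={database}",
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": MSSQL_DRIVER,
    }


def read_table(spark, options):
    """Load one table through JDBC using an existing SparkSession."""
    return spark.read.format("jdbc").options(**options).load()
```

Inside the notebook this could be called as `read_table(spark, build_jdbc_options("my-server", "my_db", "my_user", "my_password", "dbo.my_table"))`, with the placeholders replaced by the values from step 2.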

Report example

Preview: reportexample

You can view the full report by opening this file.
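
The notebook and the report are not reproduced here, so the following is only a sketch of how a small PyDeequ verification run and a pass/fail summary could look. The function names `run_basic_checks` and `summarize`, the check description, and the choice of completeness and uniqueness constraints are assumptions, not the task's actual checks.

```python
from collections import Counter


def run_basic_checks(spark, df, key_column):
    """Run completeness and uniqueness checks on one column (illustrative).

    pydeequ is imported lazily so this module loads where it is not installed.
    Recent pydeequ releases also expect the SPARK_VERSION environment
    variable to be set before import.
    """
    from pydeequ.checks import Check, CheckLevel
    from pydeequ.verification import VerificationResult, VerificationSuite

    check = Check(spark, CheckLevel.Error, "basic key checks")
    result = (
        VerificationSuite(spark)
        .onData(df)
        .addCheck(check.isComplete(key_column).isUnique(key_column))
        .run()
    )
    # One row per constraint, including a constraint_status column.
    return VerificationResult.checkResultsAsDataFrame(spark, result)


def summarize(rows):
    """Count constraint results by status from a list of row dicts."""
    return dict(Counter(row["constraint_status"] for row in rows))
```

With a results DataFrame in hand, `summarize([r.asDict() for r in results_df.collect()])` would give a count of results per status.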

If it does not work and the problem appears to be on the database side:

  1. Check the settings in SQL Server Configuration Manager -> SQL Server Network Configuration -> Protocols for -> TCP/IP Properties -> IP Addresses.
  2. At least one IP address must be active and enabled.
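
Before changing server settings, it can help to confirm whether the SQL Server TCP port is reachable from inside the container at all. The sketch below uses only the Python standard library; the function name `can_connect` and the default port 1433 are assumptions, so use whichever port is configured under IP Addresses.

```python
import socket


def can_connect(host, port=1433, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

A `False` result points to a network or TCP/IP configuration problem rather than to wrong credentials, since no login is attempted.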

About

Docker -> Spark -> Jupyter -> Pydeequ
