RANGER-5175: Functional Test Case Support for KMS API and HDFS Encryption #547

Open
wants to merge 29 commits into
base: master
c3df1c9
RANGER-5175: Functional Test Case Support for KMS API and HDFS Encryp…
Mar 24, 2025
1ad3fb0
Update test-pytest.sh
ChinmayHegde24 Mar 24, 2025
18e90df
Update readme.md
ChinmayHegde24 Mar 24, 2025
52fffc0
Update readme.md
ChinmayHegde24 Mar 24, 2025
9aa0c62
Update readme.md
ChinmayHegde24 Mar 26, 2025
3434af7
Update readme.md
ChinmayHegde24 Mar 26, 2025
cd96c97
Updated directory structure
ChinmayHegde24 Apr 2, 2025
5765da2
Updated directory structure
ChinmayHegde24 Apr 2, 2025
ca8eea2
Add files via upload
ChinmayHegde24 Apr 2, 2025
2ee818d
Update test-pytest.sh
ChinmayHegde24 Apr 2, 2025
09fe18d
Update readme.md
ChinmayHegde24 Apr 2, 2025
a2237d1
Refactored test_encryption.py
ChinmayHegde24 Apr 2, 2025
9c5e16c
added run_command and get_error_logs in utils.py
ChinmayHegde24 Apr 2, 2025
41817ea
Added test_config and conftest files
ChinmayHegde24 Apr 2, 2025
f8f127c
RANGER-5175:Updated test-pytest.sh
ChinmayHegde24 Apr 2, 2025
c4d917b
RANGER-5175: Refactored test_encryption.py
ChinmayHegde24 Apr 2, 2025
73c08ec
RANGER-5175: Made test-pytest.sh script file to be dynamic
ChinmayHegde24 Apr 10, 2025
daea26b
RANGER-5175:Updated Readme.md
ChinmayHegde24 May 6, 2025
fff53dd
RANGER-5175: Duplicate key creation test case added
ChinmayHegde24 May 6, 2025
0f37ea8
RANGER-5175:Update test_keyOps.py
ChinmayHegde24 May 6, 2025
199f27b
RANGER-5175:Update test_keyDetails.py
ChinmayHegde24 May 6, 2025
a02cdd4
RANGER-5175: Updated test_kms/readme.md
ChinmayHegde24 May 6, 2025
803e87e
RANGER-5175: Added extra 2 test files
ChinmayHegde24 May 6, 2025
67e70d3
RANGER-5175:Update test_hdfs/readme.md
ChinmayHegde24 May 6, 2025
ece84a1
RANGER-5175: Parametrised command templates
ChinmayHegde24 May 6, 2025
1fda7bf
RANGER-5175: Added polling function to handle container restart
ChinmayHegde24 May 6, 2025
14ab32a
RANGER-5175:made run_command method to return exit_code as well
ChinmayHegde24 May 6, 2025
79136fc
RANGER-5175: Dynamic test command execution using parameterized templ…
ChinmayHegde24 May 6, 2025
5b3db7e
RANGER-5175: Added two new test files to test_hdfs
ChinmayHegde24 May 6, 2025
4 changes: 4 additions & 0 deletions PyTest-KMS-HDFS/pytest.ini
@@ -0,0 +1,4 @@
[pytest]
markers =
    cleanEZ: clean up the encryption zone
    createEZ: create encryption zone
66 changes: 66 additions & 0 deletions PyTest-KMS-HDFS/readme.md
@@ -0,0 +1,66 @@
# KMS API & HDFS Encryption Pytest Suite


This test suite validates REST API endpoints for KMS (Key Management Service) and tests HDFS encryption functionalities including key management and file operations within encryption zones.

**test_kms:** contains test cases that check KMS API functionality.

**test_hdfs:** contains test cases that check KMS functionality through the HDFS encryption lifecycle.
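
As a rough illustration of the kind of request the `test_kms` cases exercise, the sketch below builds (but does not send) a key-creation request for the Hadoop KMS REST API. The base URL and the `user.name=keyadmin` query parameter mirror the values in `test_hdfs/test_config.py`. The helper name `build_create_key_request`, the key name `demo_key`, and the cipher and length values are assumptions made for illustration and are not taken from the suite itself.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:9292/kms/v1"  # same value as in test_config.py
PARAMS = {"user.name": "keyadmin"}         # same value as in test_config.py


def build_create_key_request(key_name, cipher="AES/CTR/NoPadding", length=128):
    """Build (without sending) a POST request asking KMS to create a key.

    Hypothetical helper: the real tests may construct requests differently.
    """
    body = {"name": key_name, "cipher": cipher, "length": length}
    return Request(
        url=f"{BASE_URL}/keys?{urlencode(PARAMS)}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


# "demo_key" is a made-up key name used only for this sketch.
request = build_create_key_request("demo_key")
print(request.get_method(), request.full_url)
```

Building the request separately from sending it keeps the payload easy to inspect in a test before any container is involved.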

## 📂 Directory Structure

```
test_directory/
├── test_kms/                     # Tests on KMS API
│   ├── test_keys.py              # Key creation and key name validation
│   ├── test_keys_02.py           # Extra test cases on key operation
│   ├── test_keyDetails.py        # getKeyName, getKeyMetadata, getKeyVersion checks
│   ├── test_keyOps.py            # Key operations: Roll-over, generate DEK, Decrypt EDEK
│   ├── test_keyOps_policy.py     # Validate key operations based on policy enforcement
│   ├── conftest.py               # Reusable fixtures and setup
│   ├── utils.py                  # Utility methods
│   └── readme.md
├── test_hdfs/                    # Tests on HDFS encryption cycle
│   ├── test_encryption.py        # test file 1
│   ├── test_encryption02.py      # test file 2
│   ├── test_encryption03.py      # test file 3
│   ├── test_config.py            # Stores all constants and HDFS commands
│   ├── conftest.py               # Sets up the environment
│   ├── readme.md
│   └── utils.py                  # Utility methods
│
├── pytest.ini                    # Registers custom pytest markers
├── requirements.txt
└── README.md                     # This file
```

## ⚙️ Setup Instructions
Bring up the KMS container and any dependent containers using Docker.

Create a virtual environment and install the required packages from requirements.txt.

## Run test cases

**Navigate to the PyTest-KMS-HDFS directory**

**To run tests in the test_kms folder**
> pytest -vs test_kms/

To run with an HTML report
> pytest -vs test_kms/ --html=kms-report.html


**To run tests in the test_hdfs folder**

> pytest -vs -k "test_encryption"

or

> pytest -vs test_hdfs/

To run with an HTML report
> pytest -vs test_hdfs/ --html=hdfs-report.html

## 📌 Notes

Ensure the Docker containers for KMS and HDFS are running before executing tests.

Reports generated with --html can be viewed in any browser for detailed test results.
20 changes: 20 additions & 0 deletions PyTest-KMS-HDFS/requirements.txt
@@ -0,0 +1,20 @@
annotated-types==0.7.0
certifi==2025.1.31
charset-normalizer==3.4.1
docker==7.1.0
idna==3.10
iniconfig==2.0.0
Jinja2==3.1.6
MarkupSafe==3.0.2
packaging==24.2
pluggy==1.5.0
pydantic==2.11.0
pydantic_core==2.33.0
pytest==8.3.5
pytest-html==4.1.1
pytest-metadata==3.1.1
python-on-whales==0.76.1
requests==2.32.3
typing-inspection==0.4.0
typing_extensions==4.13.0
urllib3==2.3.0
103 changes: 103 additions & 0 deletions PyTest-KMS-HDFS/test_hdfs/conftest.py
@@ -0,0 +1,103 @@
import time

import docker
import pytest

from test_config import (
    CORE_SITE_XML_PATH,
    HADOOP_CONTAINER,
    HDFS_USER,
    KMS_PROPERTY,
    SET_PATH_CMD,
)

# Setup Docker client
client = docker.from_env()


@pytest.fixture(scope="module")
def hadoop_container():
    # Get the Hadoop container instance
    return client.containers.get(HADOOP_CONTAINER)


# Polling helper: wait until HDFS responds again after a container restart
def wait_for_hdfs(container, user='hdfs', timeout=30, interval=2):
    print("Waiting for HDFS to become available...")
    start_time = time.time()

    while time.time() - start_time < timeout:
        exit_code, _ = container.exec_run("hdfs dfs -ls /", user=user)
        if exit_code == 0:
            print("HDFS is ready.")
            return True
        print("⏳ HDFS not ready yet, retrying...")
        time.sleep(interval)

    raise TimeoutError("HDFS did not become ready within the timeout period.")


def configure_kms_property(hadoop_container):
    # Check if the KMS property already exists
    check_cmd = f"grep 'hadoop.security.key.provider.path' {CORE_SITE_XML_PATH}"
    exit_code, _ = hadoop_container.exec_run(check_cmd, user='root')

    if exit_code != 0:
        # Insert the KMS property just before the closing </configuration> tag
        insert_cmd = f"sed -i '/<\\/configuration>/i {KMS_PROPERTY}' {CORE_SITE_XML_PATH}"
        exit_code, output = hadoop_container.exec_run(insert_cmd, user='root')
        print(f"KMS property inserted. Exit code: {exit_code}")

        # Debug: show the updated file
        cat_cmd = f"cat {CORE_SITE_XML_PATH}"
        _, file_content = hadoop_container.exec_run(cat_cmd, user='root')
        print("Updated core-site.xml:\n", file_content.decode())

        # Restart the container to apply the config changes
        print("Restarting Hadoop container to apply changes...")
        hadoop_container.restart()
        wait_for_hdfs(hadoop_container, user=HDFS_USER)  # Wait for HDFS to respond again
        print("Hadoop container restarted and ready.")
    else:
        print("KMS provider already present. No need to update config.")


def ensure_user_exists(hadoop_container, username):
    # Ensure the given user exists inside the container
    print(f"Ensuring {username} user exists...")
    user_check_cmd = f"id -u {username}"
    exit_code, _ = hadoop_container.exec_run(user_check_cmd, user='root')

    if exit_code != 0:
        # Create the user if not already present
        create_user_cmd = f"useradd {username}"
        exit_code, output = hadoop_container.exec_run(create_user_cmd, user='root')
        print(f"{username} user created. Exit code: {exit_code}")

        # Add the user to the hadoop group
        assign_permissions_cmd = f"usermod -aG hadoop {username}"
        exit_code, output = hadoop_container.exec_run(assign_permissions_cmd, user='root')
        print(f"Permissions assigned to {username}. Exit code: {exit_code}")
    else:
        print(f"{username} user already exists. No need to create.")


# Automatically set up the environment before tests run
@pytest.fixture(scope="module", autouse=True)
def setup_environment(hadoop_container):
    set_path_cmd = SET_PATH_CMD
    hadoop_container.exec_run(set_path_cmd, user='root')

    configure_kms_property(hadoop_container)
    ensure_user_exists(hadoop_container, "keyadmin")

    # Exit safe mode
    print("Exiting HDFS Safe Mode...")
    hadoop_container.exec_run("hdfs dfsadmin -safemode leave", user=HDFS_USER)

    yield  # Run tests

    # Post-test cleanup
    print("Tests completed.")
107 changes: 107 additions & 0 deletions PyTest-KMS-HDFS/test_hdfs/readme.md
@@ -0,0 +1,107 @@
# Main directory for testing the HDFS encryption cycle

## Structure
```
test_hdfs/
├── test_encryption.py
├── test_encryption02.py
├── test_encryption03.py
├── test_config.py #stores all constants and HDFS commands
├── conftest.py #sets up the environment
├── utils.py #utility methods

```

---

## Extra Features

- **Markers:**
  Custom markers (`createEZ` and `cleanEZ`, registered in `pytest.ini`) are used to run specific test cases selectively, which improves test efficiency and organization.
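
A minimal sketch of how the `createEZ` and `cleanEZ` markers registered in `pytest.ini` might be attached to tests. The two test functions and their placeholder bodies are hypothetical and only show the decorator usage; the real tests run HDFS commands inside the container.

```python
import pytest


@pytest.mark.createEZ
def test_create_encryption_zone_example():
    # Placeholder body: a real test would create the encryption zone here.
    assert True


@pytest.mark.cleanEZ
def test_cleanup_example():
    # Placeholder body: a real test would remove the encryption zone here.
    assert True
```

With markers in place, `pytest -vs -m createEZ test_hdfs/` runs only the tests tagged `createEZ`, and `-m cleanEZ` runs only the cleanup tests.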

---

### `setup_environment`

Handled in `conftest.py`.

Before running the test cases, some environment configuration is needed:
- HDFS must communicate with KMS to fetch key details.
- The KMS provider property (`hadoop.security.key.provider.path`) is added to the `core-site.xml` file if it is not already present.
- The Hadoop container is restarted to apply the change, and the suite polls until HDFS responds again.
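
The wait after a container restart follows a generic poll-until-ready pattern, which can be sketched roughly as below. The helper name `wait_until` and the `check` callable are assumptions for illustration; in the suite, the readiness check is an `hdfs dfs -ls /` call executed inside the container.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float = 30.0, interval: float = 2.0) -> bool:
    """Call `check` repeatedly until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(interval)  # back off before the next attempt
    raise TimeoutError("Service did not become ready within the timeout period.")


# Demo with a stand-in check that succeeds on the third attempt.
attempts = {"count": 0}


def ready_on_third_try() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


wait_until(ready_on_third_try, timeout=5, interval=0.01)
print(f"ready after {attempts['count']} attempts")
```

Polling with a timeout replaces a fixed sleep: the tests resume as soon as HDFS answers and fail with a clear error if it never does.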

---

### Utility Methods

- **get_error_logs:**
  Fetches logs from both the KMS and HDFS containers, which helps identify issues when errors or exceptions occur during testing.

- **run_command:**
  Executes the required HDFS commands inside the containers and returns the exit code along with the command output.
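
A minimal sketch of what a `run_command`-style helper could look like, assuming it wraps the Docker SDK's `Container.exec_run`, which returns an exit code and the raw output. The signature shown here and the `FakeContainer` stand-in are assumptions; the actual implementation in `utils.py` may differ.

```python
from typing import Tuple


def run_command(container, command: str, user: str = "hdfs") -> Tuple[int, str]:
    """Run `command` in `container` and return (exit_code, decoded_output)."""
    # exec_run returns an (exit_code, output) pair; output is bytes
    # when streaming is disabled, which is the default.
    exit_code, output = container.exec_run(command, user=user)
    text = output.decode("utf-8", errors="replace") if output else ""
    return exit_code, text


class FakeContainer:
    """Stand-in for a Docker container so the sketch runs without Docker."""

    def exec_run(self, command, user=""):
        return 0, f"{user} ran: {command}".encode("utf-8")


exit_code, text = run_command(FakeContainer(), "hdfs dfs -ls /", user="hdfs")
print(exit_code, text)
```

Returning the exit code alongside the output lets a test assert on success or failure directly instead of parsing log text.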

---

## `test_encryption.py`

Handles the **full HDFS encryption cycle**, including setup, positive and negative test scenarios, and cleanup.

### Main Highlights:
- Encryption Zone (EZ) creation in HDFS.
- Granting permissions to specific users for read/write operations within the EZ.
- Validating read/write attempts by unauthorized users inside the EZ.


## Test Cases

### ✅ Positive Test Cases

1. **test_create_key:**
Creates an Encryption Zone (EZ) Key which is required to create an EZ.

2. **test_create_encryption_zone:**
Creates an Encryption Zone (EZ) using an existing EZ key.

3. **test_grant_permissions:**
Grants read-write permissions to a specific user (e.g., HIVE) within the EZ.

4. **test_hive_user_write_read:**
Performs write and read operations inside the EZ using the authorized HIVE user.

---

### ❌ Negative Test Cases

1. **test_unauthorized_write:**
Attempts to write inside the EZ using an unauthorized user (e.g., HBASE). Validates expected denial of access.

2. **test_unauthorized_read:**
Attempts to read inside the EZ using an unauthorized user. Validates expected denial of access.
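
The negative cases can be pictured as rendering a command template for the unauthorized user and then checking that the command fails. In the sketch below, the template is copied from `test_config.py`; the helper names `render` and `is_denied`, the file name `demo`, and the zone name `demo_zone` are assumptions for illustration. Treating any non-zero exit code as a denial is also a simplification.

```python
# Template copied from test_config.py
UNAUTHORIZED_WRITE_COMMAND = "hdfs dfs -put /home/{user}/{filename}.txt /{ez_name}/"


def render(template: str, **values: str) -> str:
    """Fill the {placeholders} of a command template."""
    return template.format(**values)


def is_denied(exit_code: int) -> bool:
    """Treat any non-zero exit code as a denied (failed) operation."""
    return exit_code != 0


# Hypothetical values: "demo" and "demo_zone" are made-up names.
command = render(
    UNAUTHORIZED_WRITE_COMMAND,
    user="hbase",
    filename="demo",
    ez_name="demo_zone",
)
print(command)
```

A test would pass `command` to the command runner as the unauthorized user and assert that `is_denied` holds for the returned exit code.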

---

### 🧹 Cleanup

- **test_cleanup:**
Cleans up the Encryption Zone and all files created during testing.
Deletes the EZ key created earlier.
Ensures the test environment is reset for clean re-runs.

---

## `test_encryption02.py`

Covers key roll-over and key deletion scenarios:
- Checks whether old files can still be read after a key roll-over.
- Checks whether new files can be written and read after a key roll-over.
- Checks the read operation on a file after key deletion.

---

## `test_encryption03.py`

Covers **cross-encryption-zone operations**.



## Summary

This test suite ensures that **HDFS encryption and access control mechanisms** function as expected, validating both authorized and unauthorized access scenarios while maintaining a clean and reusable test environment.
79 changes: 79 additions & 0 deletions PyTest-KMS-HDFS/test_hdfs/test_config.py
@@ -0,0 +1,79 @@

## Contains all constant values regarding USER, PATH, HDFS commands ----------------------


HDFS_USER = "hdfs"
HIVE_USER = "hive"
HBASE_USER = "hbase"
KEY_ADMIN = "keyadmin"
HEADERS = {"Content-Type": "application/json", "Accept": "application/json"}
PARAMS = {"user.name": "keyadmin"}
BASE_URL = "http://localhost:9292/kms/v1"
HADOOP_CONTAINER = "ranger-hadoop"
KMS_CONTAINER = "ranger-kms"

# KMS configs that need to be added to the XML file ------------ add more if needed
KMS_PROPERTY = """<property><name>hadoop.security.key.provider.path</name><value>kms://[email protected]:9292/kms</value></property>"""

CORE_SITE_XML_PATH = "/opt/hadoop/etc/hadoop/core-site.xml"

# Ensure PATH is set for /opt/hadoop/bin
SET_PATH_CMD="echo 'export PATH=/opt/hadoop/bin:$PATH' >> /etc/profile && export PATH=/opt/hadoop/bin:$PATH"

HADOOP_NAMENODE_LOG_PATH="/opt/hadoop/logs/hadoop-hdfs-namenode-ranger-hadoop.example.com.log"

KMS_LOG_PATH="/var/log/ranger/kms/ranger-kms-ranger-kms.example.com-root.log"


# HDFS Commands----------------------------------------------------
CREATE_KEY_COMMAND = "hadoop key create {key_name} -size 128 -provider kms://[email protected]:9292/kms"

VALIDATE_KEY_COMMAND = "hadoop key list -provider kms://[email protected]:9292/kms"

CREATE_EZ_COMMANDS = [
    "hdfs dfs -mkdir /{ez_name}",
    "hdfs crypto -createZone -keyName {key_name} -path /{ez_name}",
    "hdfs crypto -listZones"
]

GRANT_PERMISSIONS_COMMANDS = [
    "hdfs dfs -chmod -R 700 /{ez_name}",
    "hdfs dfs -chown -R {user}:{user} /{ez_name}"
]

CREATE_FILE_COMMAND = [
    'echo "{filecontent}" > /home/{user}/{filename}.txt && ls -l /home/{user}/{filename}.txt'
]

ACTIONS_COMMANDS = [
    "hdfs dfs -put /home/{user}/{filename}.txt /{ez_name}/",
    "hdfs dfs -ls /{ez_name}/",
    "hdfs dfs -cat /{ez_name}/{filename}.txt"
]

CROSS_EZ_ACTION_COMMANDS = [
    "hdfs dfs -put /home/{user}/{filename}.txt /{ez_name}/{dirname}/",
    "hdfs dfs -ls /{ez_name}/",
    "hdfs dfs -cat /{ez_name}/{dirname}/{filename}.txt"
]

READ_EZ_FILE = [
    "hdfs dfs -cat /{ez_name}/{filename}.txt"
]

UNAUTHORIZED_WRITE_COMMAND = 'hdfs dfs -put /home/{user}/{filename}.txt /{ez_name}/'

UNAUTHORIZED_READ_COMMAND = "hdfs dfs -cat /{ez_name}/{filename}.txt"

CLEANUP_COMMANDS = [
    "hdfs dfs -rm /{ez_name}/{filename}.txt",
    "hdfs dfs -rm -R /{ez_name}"
]

CLEANUP_EZ = [
    "hdfs dfs -rm -R /{ez_name}"
]

CLEANUP_EZ_FILE = [
    "hdfs dfs -rm /{ez_name}/{filename}.txt"
]
KEY_DELETION_CMD = "bash -c \"echo 'Y' | hadoop key delete {key_name} -provider kms://[email protected]:9292/kms\""

