How to handle nutrition data #131422
Unanswered
ajama01
asked this question in
Programming Help
Replies: 2 comments 2 replies
-
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
import numpy as np
import matplotlib.pyplot as plt
# Load data
data = pd.read_csv('ffq_data.csv')
# Log transformation
data_log_transformed = np.log1p(data)
# Standardize
scaler = StandardScaler()
data_scaled = scaler.fit_transform(data_log_transformed)
# PCA time
pca = PCA(n_components=2)
principal_components = pca.fit_transform(data_scaled)
# Convert to DataFrame
pca_df = pd.DataFrame(data=principal_components, columns=['PC1', 'PC2'])
# Plot it
plt.scatter(pca_df['PC1'], pca_df['PC2'])
plt.xlabel('PC1')
plt.ylabel('PC2')
plt.title('PCA of FFQ Data')
plt.show()
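Before reading anything into the PCA plot, it may help to measure how sparse each food column actually is. The following is a minimal sketch, not part of the original post: the helper `zero_fraction` and the column names `bread`, `rice`, and `caviar` are hypothetical, and the small deterministic table stands in for `ffq_data.csv`, whose real layout is not shown in the thread.

```python
import numpy as np
import pandas as pd


def zero_fraction(df: pd.DataFrame) -> pd.Series:
    """Return the share of zero entries per numeric column, sparsest first."""
    numeric = df.select_dtypes(include="number")
    return (numeric == 0).mean().sort_values(ascending=False)


# Hypothetical stand-in for ffq_data.csv: 200 respondents, 3 food items.
demo = pd.DataFrame({
    "bread": np.arange(200) % 7 + 1,                  # never zero
    "rice": np.arange(200) % 4,                       # zero for 1 in 4 rows
    "caviar": np.r_[np.ones(10), np.zeros(190)],      # zero for 95% of rows
})

fractions = zero_fraction(demo)
print(fractions)
```

Columns near the top of this listing are the ones most likely to be the "mostly 0" foods described in the question.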
2 replies
-
Thanks for posting in the GitHub Community, @ajama01! We've moved your post to our Programming Help 🧑‍💻 category, which is more appropriate for this type of discussion. Please review our guidelines about the Programming Help category for more information.
-
Select Topic Area
Question
Body
Hello everyone, I'm currently working with data from an FFQ (Food Frequency Questionnaire). I tried running PCA on it, but the results are not really interpretable: the variables and the individuals overlap in the plots. The problem is that some of the food columns are mostly 0, and I don't know how to handle this kind of data. What do you suggest?
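One option that could be tried, sketched here under stated assumptions, is to leave out foods that almost nobody reports eating before the log transform, scaling, and PCA. The function `pca_on_common_foods`, the 0.8 cutoff, and the synthetic columns are all hypothetical illustrations, not something taken from the thread. Whether dropping rare foods is appropriate, as opposed to for example merging them into broader food groups, is a judgment call that depends on the study.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def pca_on_common_foods(df, max_zero_share=0.8, n_components=2):
    """Drop columns that are mostly zero, then log-transform, scale, and run PCA.

    max_zero_share is an assumed cutoff: columns with a larger share of
    zeros are excluded from the analysis.
    """
    numeric = df.select_dtypes(include="number")
    zero_share = (numeric == 0).mean()
    kept = numeric.loc[:, zero_share <= max_zero_share]

    scaled = StandardScaler().fit_transform(np.log1p(kept))
    pca = PCA(n_components=n_components)
    scores = pca.fit_transform(scaled)

    # Loadings show how strongly each retained food contributes to each PC.
    loadings = pd.DataFrame(
        pca.components_.T,
        index=kept.columns,
        columns=[f"PC{i + 1}" for i in range(n_components)],
    )
    return scores, loadings, pca.explained_variance_ratio_


# Synthetic stand-in for the FFQ table (hypothetical food names).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "bread": rng.poisson(5, size=200) + 1,
    "rice": rng.poisson(3, size=200) + 1,
    "fruit": rng.poisson(4, size=200) + 1,
    "caviar": np.r_[np.ones(10), np.zeros(190)],  # zero for 95% of rows
})

scores, loadings, explained = pca_on_common_foods(demo)
print(loadings)
print(explained)
```

Looking at the loadings and the explained variance ratio, rather than only the scatter plot, can make it easier to tell whether the components separate meaningful dietary patterns.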