Example data error #5

Open
rwdavies opened this issue Oct 8, 2020 · 2 comments

@rwdavies

rwdavies commented Oct 8, 2020

Hi,

I tried running the example data with the following code and got the following error.

Thanks
Robbie

set -e
tmp_dir=$(mktemp)
rm ${tmp_dir}
mkdir -p ${tmp_dir}
cd ${tmp_dir}

echo ${tmp_dir}
git clone https://github.com/immunogenomics/HLA-TAPAS
cd HLA-TAPAS

python HLA-TAPAS.py \
    --target example/Case+Control.300+300.chr6.hg18 \
    --reference example/1000G.EUR.chr6.hg18.28mb-35mb \
    --hped-Ggroup example/1000G.EUR.Ggroup.hped \
    --pheno example/Case+Control.300+300.phe \
    --hg 18 \
    --out MyHLA-TAPAS/Case+Control+1000G_EUR_REF \
    --mem 16g \
    --nthreads 4
/tmp/tmp.GVS8KUH4KH
Cloning into 'HLA-TAPAS'...
Checking out files:  18% (45/250)   

[redacted, it checks out files for a while]

Checking out files: 100% (250/250)   
Checking out files: 100% (250/250), done.
Warning: Variants 'HLA_A*01:01:01G' and 'HLA_A*01' have the same position.
Warning: Variants 'HLA_A*02' and 'HLA_A*01:01:01G' have the same position.
Warning: Variants 'HLA_A*02:01:01G' and 'HLA_A*02' have the same position.
5296 more same-position warnings: see log file.
Namespace(aa_only=False, condition=None, condition_gene=None, condition_list=None, covar=None, covar_name=None, dependency='dependency/', exclude_composites=False, exhaustive=False, exhaustive_aa_pos=None, exhaustive_max_aa=2, exhaustive_min_aa=2, exhaustive_no_filter=False, hg='18', hped='example/1000G.EUR.Ggroup.hped', maf_threshold=0.005, mem='16g', min_haplo_count=10, niterations=5, nthreads=4, out='MyHLA-TAPAS/Case+Control+1000G_EUR_REF', output_composites=False, pcs=None, pheno='example/Case+Control.300+300.phe', pheno_name=None, pop=None, reference='example/1000G.EUR.chr6.hg18.28mb-35mb', reference_bim=None, remove_samples_aa_pattern=None, remove_samples_by_haplo=False, save_intermediates=False, sex=None, target='example/Case+Control.300+300.chr6.hg18', tolerated_diff=0.15)

[HLA-TAPAS.py]: Generating G-group CHPED with given HPED('example/1000G.EUR.Ggroup.hped').
[NomenCleaner.py]: Generating CHPED with G code HLA alleles.
[HLA-TAPAS.py]: Generated CHPED: 'MyHLA-TAPAS/Case+Control+1000G_EUR_REF.Ggroup.chped'.


[HLA-TAPAS.py]: Generating Reference panel.
[MakeReference_v2.py]: Making Reference Panel for "MyHLA-TAPAS/Case+Control+1000G_EUR_REF.REF.bglv4"
[1] Generating Amino acid(AA)sequences from HLA types.
[2] Encoding Amino acids positions.
[3] Encoding HLA alleles.
[4] Generating DNA(SNPS) sequences from HLA types.
[5] Encoding SNP positions.
[6] Extracting founders.
[7] Merging SNP, HLA, and amino acid datasets.
[8] Performing quality control.
[9] Preparing files for Beagle.
[10] Converting PLINK to BEAGLE format.
[11] Converting BEAGLE to VCF format.
[12] Phasing reference using Beagle4.1.
[13] Making reference panel for HLA-AA,SNPS,HLA and Normal variants(SNPs) is Done!
[HLA-TAPAS.py]: Generated Reference panel : 'MyHLA-TAPAS/Case+Control+1000G_EUR_REF.REF.bglv4'.


[HLA-TAPAS.py]: Performing SNP2HLA imputation.
SNP2HLA: Performing HLA imputation for dataset example/Case+Control.300+300.chr6.hg18
- Java memory = 16gb
[1] Extracting SNPs from the MHC.
[2] Performing SNP quality control.
[3] Converting PLINK to BEAGLE format.
[4] Converting BEAGLE to VCF format.
[5] Performing HLA imputation.
[HLA-TAPAS.py]: Imputed result : 'MyHLA-TAPAS/Case+Control+1000G_EUR_REF.IMPUTED.bgl.phased.vcf.gz'


[HLAassoc.py::WARNING]: Using phenotype column 'RA' in 'example/Case+Control.300+300.phe' file.
[HLAassoc.py]: Performing Logistic Regression.
[HLA-TAPAS.py]: Output Logistic Regression result : 'MyHLA-TAPAS/Case+Control+1000G_EUR_REF.IMPUTED.assoc.logistic'.


[HLAassoc.py]: Phased BEAGLE file will be generated from given VCF file('MyHLA-TAPAS/Case+Control+1000G_EUR_REF.IMPUTED.bgl.phased.vcf.gz').
[HLAassoc.py]: Top 10 PCs will be generated from given VCF file('MyHLA-TAPAS/Case+Control+1000G_EUR_REF.IMPUTED.bgl.phased.vcf.gz').
[HLAassoc.py::WARNING]: All samples will be assumed to be originated from same population.
Traceback (most recent call last):
  File "HLA-TAPAS.py", line 409, in <module>
    f_exhaustive_no_filter=args.exhaustive_no_filter
  File "HLA-TAPAS.py", line 171, in HLA_TAPAS
    _java_heap_mem=b_mem)
  File "/tmp/tmp.GVS8KUH4KH/HLA-TAPAS/HLAassoc/HLAassoc.py", line 645, in __init__
    if self.hasSEXinFAM(self.fam):
  File "/tmp/tmp.GVS8KUH4KH/HLA-TAPAS/HLAassoc/HLAassoc.py", line 985, in hasSEXinFAM
    f_NA3 = df_fam['Sex'].isna()
  File "/apps/well/python/3.5.2-gcc5.4.0/lib/python3.5/site-packages/pandas/core/generic.py", line 2360, in __getattr__
    (type(self).__name__, name))
AttributeError: 'Series' object has no attribute 'isna'

@WansonChoi
Collaborator

I reran the example data on my system and it worked fine, so I guess this may be due to a difference in system setup.

Can you first tell me which pandas version you are using?
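
One way to gather the details asked for here is sketched below. It is only an illustration, not part of HLA-TAPAS: the helper name `environment_report` is hypothetical. It reports the interpreter version, the pandas version and install path, and whether `Series.isna` (the attribute the traceback says is missing) exists. It avoids f-strings so it also runs on Python 3.5.

```python
import importlib
import sys


def environment_report(module_name="pandas"):
    """Collect interpreter and package details useful for a bug report.

    `environment_report` is a hypothetical helper, not an HLA-TAPAS function.
    """
    report = {"python": sys.version.split()[0]}
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        report[module_name] = "not installed"
        return report
    report[module_name] = getattr(module, "__version__", "unknown")
    # The path shows which copy of the package is actually being imported,
    # which matters when PYTHONPATH points at a user-level install.
    report["path"] = getattr(module, "__file__", "unknown")
    if module_name == "pandas":
        # Series.isna is the method named in the AttributeError.
        report["series_has_isna"] = hasattr(module.Series, "isna")
    return report


if __name__ == "__main__":
    for key, value in sorted(environment_report().items()):
        print("{}: {}".format(key, value))
```

Pasting this output into the issue would show at a glance whether the installed pandas provides `isna`.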

Traceback (most recent call last):
  File "HLA-TAPAS.py", line 409, in <module>
    f_exhaustive_no_filter=args.exhaustive_no_filter
  File "HLA-TAPAS.py", line 171, in HLA_TAPAS
    _java_heap_mem=b_mem)
  File "/tmp/tmp.GVS8KUH4KH/HLA-TAPAS/HLAassoc/HLAassoc.py", line 645, in __init__
    if self.hasSEXinFAM(self.fam):
  File "/tmp/tmp.GVS8KUH4KH/HLA-TAPAS/HLAassoc/HLAassoc.py", line 985, in hasSEXinFAM
    f_NA3 = df_fam['Sex'].isna()
  File "/apps/well/python/3.5.2-gcc5.4.0/lib/python3.5/site-packages/pandas/core/generic.py", line 2360, in __getattr__
    (type(self).__name__, name))
AttributeError: 'Series' object has no attribute 'isna'

In the error message, 'isna' is a pandas Series method (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.isna.html), and it caused no trouble for me.

For reference, my pandas version was 1.0.5.
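
For context, `Series.isna` was added in pandas 0.21.0, while the older spelling `Series.isnull` returns the same mask and is still available. A minimal sketch of a version-tolerant check follows. The helper name `missing_mask` is hypothetical, and this is not a change that has been made in HLAassoc.py, only one possible workaround for environments stuck on an old pandas.

```python
def missing_mask(series):
    """Return the missing-value mask of a pandas Series on old or new pandas.

    `missing_mask` is a hypothetical helper, not part of HLA-TAPAS.
    """
    # Prefer isna (pandas >= 0.21.0); fall back to the older isnull alias.
    checker = getattr(series, "isna", None)
    if checker is None:
        checker = series.isnull
    return checker()


# Possible use at the failing line from the traceback (an assumption, untested
# against the real code base):
#     f_NA3 = missing_mask(df_fam['Sex'])
```

Upgrading pandas is the simpler route when it is possible; the fallback only helps when the installed version cannot be changed.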

Thanks,
Wanson

@rwdavies
Author

I upgraded pandas to a later version, 0.24.2, since this machine only has Python 3.5.2 (I do not have system write access and did not want to install Python manually). The pipeline now runs, which looks good, but I still get some odd-looking error messages, so I am not sure this is out of the woods yet.

set -e
tmp_dir=$(mktemp)
rm ${tmp_dir}
mkdir -p ${tmp_dir}
cd ${tmp_dir}

echo ${tmp_dir}
git clone https://github.com/immunogenomics/HLA-TAPAS
cd HLA-TAPAS

mkdir -p /well/davies/users/dcc832/bin/python_packages
pip3 install --target=/well/davies/users/dcc832/bin/python_packages/ pandas
export PYTHONPATH=/well/davies/users/dcc832/bin/python_packages/
python -c "import pandas; print(pandas.__version__)"
## note this now gives version 0.24.2

python HLA-TAPAS.py \
    --target example/Case+Control.300+300.chr6.hg18 \
    --reference example/1000G.EUR.chr6.hg18.28mb-35mb \
    --hped-Ggroup example/1000G.EUR.Ggroup.hped \
    --pheno example/Case+Control.300+300.phe \
    --hg 18 \
    --out MyHLA-TAPAS/Case+Control+1000G_EUR_REF \
    --mem 16g \
    --nthreads 4
/tmp/tmp.7ZYh27j1KG
Cloning into 'HLA-TAPAS'...
Checking out files:  26% (67/250)   
[redacted many checking out files lines] 
Checking out files: 100% (250/250), done.
Collecting pandas
  Using cached https://files.pythonhosted.org/packages/74/24/0cdbf8907e1e3bc5a8da03345c23cbed7044330bb8f73bb12e711a640a00/pandas-0.24.2-cp35-cp35m-manylinux1_x86_64.whl
Collecting numpy>=1.12.0 (from pandas)
  Using cached https://files.pythonhosted.org/packages/b5/36/88723426b4ff576809fec7d73594fe17a35c27f8d01f93637637a29ae25b/numpy-1.18.5-cp35-cp35m-manylinux1_x86_64.whl
Collecting pytz>=2011k (from pandas)
  Using cached https://files.pythonhosted.org/packages/4f/a4/879454d49688e2fad93e59d7d4efda580b783c745fd2ec2a3adf87b0808d/pytz-2020.1-py2.py3-none-any.whl
Collecting python-dateutil>=2.5.0 (from pandas)
  Using cached https://files.pythonhosted.org/packages/d4/70/d60450c3dd48ef87586924207ae8907090de0b306af2bce5d134d78615cb/python_dateutil-2.8.1-py2.py3-none-any.whl
Collecting six>=1.5 (from python-dateutil>=2.5.0->pandas)
  Using cached https://files.pythonhosted.org/packages/ee/ff/48bde5c0f013094d729fe4b0316ba2a24774b3ff1c52d924a8a4cb04078a/six-1.15.0-py2.py3-none-any.whl
Installing collected packages: numpy, pytz, six, python-dateutil, pandas
Successfully installed numpy-1.18.5 pandas-0.24.2 python-dateutil-2.8.1 pytz-2020.1 six-1.15.0
Target directory /well/davies/users/dcc832/bin/python_packages/numpy already exists. Specify --upgrade to force replacement.
Target directory /well/davies/users/dcc832/bin/python_packages/numpy-1.18.5.dist-info already exists. Specify --upgrade to force replacement.
Target directory /well/davies/users/dcc832/bin/python_packages/numpy.libs already exists. Specify --upgrade to force replacement.
Target directory /well/davies/users/dcc832/bin/python_packages/pytz already exists. Specify --upgrade to force replacement.
Target directory /well/davies/users/dcc832/bin/python_packages/pytz-2020.1.dist-info already exists. Specify --upgrade to force replacement.
Target directory /well/davies/users/dcc832/bin/python_packages/six.py already exists. Specify --upgrade to force replacement.
Target directory /well/davies/users/dcc832/bin/python_packages/six-1.15.0.dist-info already exists. Specify --upgrade to force replacement.
Target directory /well/davies/users/dcc832/bin/python_packages/__pycache__ already exists. Specify --upgrade to force replacement.
Target directory /well/davies/users/dcc832/bin/python_packages/dateutil already exists. Specify --upgrade to force replacement.
Target directory /well/davies/users/dcc832/bin/python_packages/python_dateutil-2.8.1.dist-info already exists. Specify --upgrade to force replacement.
Target directory /well/davies/users/dcc832/bin/python_packages/pandas already exists. Specify --upgrade to force replacement.
Target directory /well/davies/users/dcc832/bin/python_packages/pandas-0.24.2.dist-info already exists. Specify --upgrade to force replacement.
You are using pip version 9.0.1, however version 20.2.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
0.24.2
Warning: Variants 'HLA_A*01:01:01G' and 'HLA_A*01' have the same position.
Warning: Variants 'HLA_A*02' and 'HLA_A*01:01:01G' have the same position.
Warning: Variants 'HLA_A*02:01:01G' and 'HLA_A*02' have the same position.
5296 more same-position warnings: see log file.
Namespace(aa_only=False, condition=None, condition_gene=None, condition_list=None, covar=None, covar_name=None, dependency='dependency/', exclude_composites=False, exhaustive=False, exhaustive_aa_pos=None, exhaustive_max_aa=2, exhaustive_min_aa=2, exhaustive_no_filter=False, hg='18', hped='example/1000G.EUR.Ggroup.hped', maf_threshold=0.005, mem='16g', min_haplo_count=10, niterations=5, nthreads=4, out='MyHLA-TAPAS/Case+Control+1000G_EUR_REF', output_composites=False, pcs=None, pheno='example/Case+Control.300+300.phe', pheno_name=None, pop=None, reference='example/1000G.EUR.chr6.hg18.28mb-35mb', reference_bim=None, remove_samples_aa_pattern=None, remove_samples_by_haplo=False, save_intermediates=False, sex=None, target='example/Case+Control.300+300.chr6.hg18', tolerated_diff=0.15)

[HLA-TAPAS.py]: Generating G-group CHPED with given HPED('example/1000G.EUR.Ggroup.hped').
[NomenCleaner.py]: Generating CHPED with G code HLA alleles.
[HLA-TAPAS.py]: Generated CHPED: 'MyHLA-TAPAS/Case+Control+1000G_EUR_REF.Ggroup.chped'.


[HLA-TAPAS.py]: Generating Reference panel.
[MakeReference_v2.py]: Making Reference Panel for "MyHLA-TAPAS/Case+Control+1000G_EUR_REF.REF.bglv4"
[1] Generating Amino acid(AA)sequences from HLA types.
[2] Encoding Amino acids positions.
[3] Encoding HLA alleles.
[4] Generating DNA(SNPS) sequences from HLA types.
[5] Encoding SNP positions.
[6] Extracting founders.
[7] Merging SNP, HLA, and amino acid datasets.
[8] Performing quality control.
[9] Preparing files for Beagle.
[10] Converting PLINK to BEAGLE format.
[11] Converting BEAGLE to VCF format.
[12] Phasing reference using Beagle4.1.
[13] Making reference panel for HLA-AA,SNPS,HLA and Normal variants(SNPs) is Done!
[HLA-TAPAS.py]: Generated Reference panel : 'MyHLA-TAPAS/Case+Control+1000G_EUR_REF.REF.bglv4'.


[HLA-TAPAS.py]: Performing SNP2HLA imputation.
SNP2HLA: Performing HLA imputation for dataset example/Case+Control.300+300.chr6.hg18
- Java memory = 16gb
[1] Extracting SNPs from the MHC.
[2] Performing SNP quality control.
[3] Converting PLINK to BEAGLE format.
[4] Converting BEAGLE to VCF format.
[5] Performing HLA imputation.
[HLA-TAPAS.py]: Imputed result : 'MyHLA-TAPAS/Case+Control+1000G_EUR_REF.IMPUTED.bgl.phased.vcf.gz'


[HLAassoc.py::WARNING]: Using phenotype column 'RA' in 'example/Case+Control.300+300.phe' file.
[HLAassoc.py]: Performing Logistic Regression.
[HLA-TAPAS.py]: Output Logistic Regression result : 'MyHLA-TAPAS/Case+Control+1000G_EUR_REF.IMPUTED.assoc.logistic'.


[HLAassoc.py]: Phased BEAGLE file will be generated from given VCF file('MyHLA-TAPAS/Case+Control+1000G_EUR_REF.IMPUTED.bgl.phased.vcf.gz').
[HLAassoc.py]: Top 10 PCs will be generated from given VCF file('MyHLA-TAPAS/Case+Control+1000G_EUR_REF.IMPUTED.bgl.phased.vcf.gz').
[HLAassoc.py::WARNING]: All samples will be assumed to be originated from same population.
[HLAassoc.py::WARNING]: Sex information in given fam file('example/Case+Control.300+300.chr6.hg18.fam') will be used.
[HLAassoc.py]: Performing Omnibus Test.

[HLAassoc.py::ERROR]: Omnibus Test failed. See the log file('MyHLA-TAPAS/Case+Control+1000G_EUR_REF.IMPUTED.OMNIBUS.OMlog').
[HLA-TAPAS.py]: Output Omnibus Test result : 'None'.


[HLA-TAPAS.py]: Plotting Manhattan Plot.
[HLA-TAPAS.py]: Manhattan plot result : 'MyHLA-TAPAS/Case+Control+1000G_EUR_REF.IMPUTED.assoc.logistic.manhattan.pdf'.


[HLA-TAPAS.py]: HLA-TAPAS done.
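
The omnibus step above reports a failure and points to a log file. A minimal sketch for printing the last lines of that log, so they can be pasted into the issue, is given below. The `tail` helper is hypothetical, and the path is the one printed in the error message, relative to the HLA-TAPAS checkout directory.

```python
import os
from collections import deque

# Path taken from the "Omnibus Test failed" message in the output above.
LOG_PATH = "MyHLA-TAPAS/Case+Control+1000G_EUR_REF.IMPUTED.OMNIBUS.OMlog"


def tail(path, n=20):
    """Return the last n lines of a text file (hypothetical helper)."""
    with open(path, "r", errors="replace") as handle:
        # deque with maxlen keeps only the final n lines while reading.
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


if __name__ == "__main__":
    if os.path.exists(LOG_PATH):
        print("\n".join(tail(LOG_PATH)))
    else:
        print("log file not found: " + LOG_PATH)
```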
