[Draft]Adjustment to the PCA Approach #41

Open

ghost wants to merge 27 commits into develop from basic_PCA

ghost commented Mar 2, 2021 •

edited by ghost


Purpose

Describe the problem or feature in addition to a link to the issues.

Approach

How does this change address the problem?

Tests for New Behavior

What new tests were added to cover new features or behaviors?

Checklist

Make sure you did the following (if applicable):

Added tests for any new features or behaviors.
Ran ./pylint to make sure code style is consistent.
Built and reviewed the docs.
Added a note to the changelog.

Learning

Describe the research stage

Links to blog posts, patterns, libraries or addons used to solve this problem


          Create pca approach section in the doc

ghost added documentation enhancement labels

ghost requested a review from PanPip

March 2, 2021 16:38

ghost self-assigned this


          [Draft]Add an option to use a variable value of explained variance(like 45%) by PCA factors.

101add9

Author

ghost commented Mar 7, 2021

Add an option to set a variable target for the variance explained by the PCA factors (e.g. 45%, 55%, 65%).

jamiekeng1016 and others added 21 commits

March 6, 2021 18:00


          Add the variable, explained_var in test file

73b86d0


          bug fixed

fbe104e


          Update pca_approach.py

6da4e78


          Merge branch 'develop' into basic_PCA

74fc5ba


          Merge branch 'basic_PCA' of https://github.com/hudson-and-thames/arbitragelab into basic_PCA

21bcdc4


          Add stationary test, modified sscore

d8edcdd


          [Draft]Add stationary test, drift.

07489bc


          [Draft] add volume_modified_return function

962cf3b


          [Draft] add asymptotic PCA option

dde8280


          [Draft] fix explained_variance bug

b849a33


          [Draft]add etf_approach.py

28e0830


          [Draft] update unittest , fix bugs

5208c08


          [Draft] etf_approach style check

5049ecf


          [Draft] add unittest for volume_data/ add stock_volume.csv

ef5515e


          [Draft]add docstring to test_volume_modified_return

82124f9


          [Draft] delete comment made by mistake

bc9481d


          [Draft] fix trailing white space

ae45880


          [Draft] volume_modified_return bug fixed

abe6e38


          [Draft] residual stationarity p_value option

9100b40


          [Draft]etf_approach unittest, speed of mr adjustment,

4c98d70


          [Draft]unittest adjustment

223d780

PanPip reviewed


Contributor

PanPip left a comment

Good progress 👌

I left some comments regarding the code that we can discuss.

Now we should polish the docstrings and start writing the sphinx docs.

arbitragelab/other_approaches/pca_approach.py Outdated

Comment on lines 17 to 18

		# pylint: disable=invalid-name
		# pylint: disable=R0913

Contributor

PanPip Mar 18, 2021

Let's rather do

# pylint: disable=invalid-name, too-many-arguments

arbitragelab/other_approaches/pca_approach.py Outdated

+                      :param matrix: (pd.DataFrame) DataFrame with returns that need to be standardized.
+                      :param vol_matrix: (pd.DataFrame) DataFrame with histoircal trading volume data.
+                      :param k: (int) Look-back window used for volume moving average.
+                      :return: (pd.DataFrame) a volume-adjusted returns dataFrame

Contributor

PanPip Mar 18, 2021

:return: (pd.DataFrame) A volume-adjusted returns dataFrame.

arbitragelab/other_approaches/pca_approach.py Outdated

Comment on lines 64 to 65

		# Fill missing data with preceding values
		returns = matrix.dropna(axis=0)

Contributor

PanPip Mar 18, 2021

Should we rather fill values?
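
The difference between the comment and the code can be seen on a small, made-up returns table. This is only a sketch: the column names and values are hypothetical, and whether forward-filling returns is appropriate for this strategy is a separate modelling question.

```python
import numpy as np
import pandas as pd

# Hypothetical returns with one missing observation for asset "A".
matrix = pd.DataFrame({"A": [0.01, np.nan, 0.03],
                       "B": [0.02, 0.01, -0.01]})

# What the reviewed line does: drop every row that contains a NaN.
dropped = matrix.dropna(axis=0)

# What the code comment describes: fill gaps with the preceding value.
filled = matrix.ffill()
```

Here `dropped` loses the whole second row, including the valid observation for "B", while `filled` keeps all three rows.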

arbitragelab/other_approaches/pca_approach.py Outdated

Comment on lines 116 to 117

		# Standardized: fill nan with zero / std: fill nan with 1

Contributor

PanPip Mar 18, 2021

This can probably be removed now.

arbitragelab/other_approaches/pca_approach.py Outdated

                       So the output is a dataframe containing the weight for each asset in a portfolio for each eigen vector.
                       :param matrix: (pd.DataFrame) Dataframe with index and columns containing asset returns.
+                      :param explained_var (float) The user-defined explained variance criteria.

Contributor

PanPip Mar 18, 2021

We should add that if this parameter is given, it will override the n_components parameter. We should also mention that it should be in the range from 0 to 1.
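
A minimal sketch of how the requested docstring wording and a matching range check could look. The helper name `resolve_components` and its return value are hypothetical and not part of the PR; treating both endpoints 0 and 1 as valid is also an assumption.

```python
def resolve_components(n_components, explained_var=None):
    """
    Decide how the number of PCA components is chosen (hypothetical helper).

    :param n_components: (int) Number of PCA components to use.
    :param explained_var: (float) The user-defined explained variance criteria,
        in the range from 0 to 1. If given, it overrides n_components.
    :return: (tuple) The name of the parameter in effect and its value.
    """
    if explained_var is None:
        return "n_components", n_components

    # Assumption: both endpoints are accepted as valid input.
    if not 0 <= explained_var <= 1:
        raise ValueError("explained_var must be in the range from 0 to 1.")

    return "explained_var", explained_var
```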

tests/test_etf_approach.py Outdated

Comment on lines 6 to 18

+              Tests the PCA Strategy from the Other Approaches module.
+              """
+              import unittest
+              import os
+              import pandas as pd
+              import numpy as np
+              from arbitragelab.other_approaches import ETFStrategy
+              class TestPCAStrategy(unittest.TestCase):
+                  """
+                  Tests PCAStrategy class.

Contributor

PanPip Mar 18, 2021

The naming should be fixed.

tests/test_etf_approach.py Outdated

Comment on lines 113 to 126

+                      # Check target weights
+                      self.assertAlmostEqual(target_weights.mean()['EEM'], 0.333333, delta=1e-5)
+                      self.assertAlmostEqual(target_weights.mean()['BND'], -0.5, delta=1e-5)
+                      self.assertAlmostEqual(target_weights.mean()['SPY'], -0.38888, delta=1e-5)
+                      # Check drift argument
+                      target_weights = self.etf_strategy.get_signals(smaller_etf, smaller_dataset, k=1, corr_window=252,
+                                                                     residual_window=60, sbo=1.25, sso=1.25, ssc=0.5,
+                                                                     sbc=0.75, size=1, drift=True)
+                      # Check target weights
+                      self.assertAlmostEqual(target_weights.mean()['EEM'], 0.333333, delta=1e-5)
+                      self.assertAlmostEqual(target_weights.mean()['BND'], -0.5, delta=1e-5)
+                      self.assertAlmostEqual(target_weights.mean()['SPY'], -0.38888, delta=1e-5)

Contributor

PanPip Mar 18, 2021

It's interesting that these test values are the same.

tests/test_etf_approach.py Outdated

Comment on lines 134 to 137

+                      # Check target weights
+                      self.assertAlmostEqual(target_weights.mean()['EEM'], 0.333333, delta=1e-5)
+                      self.assertAlmostEqual(target_weights.mean()['BND'], -0.5, delta=1e-5)
+                      self.assertAlmostEqual(target_weights.mean()['SPY'], -0.38888, delta=1e-5)

Contributor

PanPip Mar 18, 2021

And these too. Can we pick the values of the parameters so the outputs are different?

arbitragelab/other_approaches/etf_approach.py Outdated

+                  def __init__(self, n_components: int = 15):
+                      """
+                      Initialize PCA StatArb Strategy.

Contributor

PanPip Mar 18, 2021

Docstrings in this class should be fixed.

arbitragelab/other_approaches/etf_approach.py Outdated

Comment on lines 327 to 330

+                      First, the correlation matrix to get PCA components is calculated using a
+                      corr_window parameter. From this, we get weights to calculate PCA factor returns.
+                      These weights are being recalculated each time we generate (residual_window) number
+                      of signals.

Contributor

PanPip Mar 18, 2021

All these descriptions should be updated to match the ETF Approach.


          [Draft]docstring of etf approach

PanPip added 3 commits

March 19, 2021 13:51


          Adjusted the PCA Approach file structure

608432b


          Adjusted PCA Strategy code logic

f5478e7


          Adjusted PCA Approach unit tests

c51d8f3

PanPip reviewed


Contributor

PanPip left a comment

Made some code fixes to this PR.

arbitragelab/pca_approach/pca_approach.py

Comment on lines +132 to +134

+                          condition = min(np.cumsum(expl_variance), key=lambda x: abs(x - explained_var))
+                          # The number of components to use
+                          num_pc = np.where(np.cumsum(expl_variance) == condition)[0][0] + 1

Contributor

PanPip Mar 19, 2021

This part is not working as expected. I'll show an example.
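
One way the selection above can miss the target is sketched here with made-up variance ratios (this is a hypothetical illustration, not the reviewer's own example). Picking the cumulative value closest to `explained_var` can return fewer components than are needed to reach it. The sketch mirrors the reviewed logic in plain Python and contrasts it with a first-crossing rule; both function names are assumptions, as is the reading that the target should be reached rather than approximated.

```python
from bisect import bisect_left
from itertools import accumulate


def closest_rule(expl_variance, explained_var):
    """Mirror of the reviewed logic: pick the cumulative value closest to the target."""
    cumulative = list(accumulate(expl_variance))
    closest = min(cumulative, key=lambda x: abs(x - explained_var))
    return cumulative.index(closest) + 1


def first_crossing_rule(expl_variance, explained_var):
    """Smallest number of components whose cumulative variance reaches the target."""
    cumulative = list(accumulate(expl_variance))
    # bisect_left returns the first position i with cumulative[i] >= explained_var.
    # Cap the result in case rounding keeps the total just below the target.
    return min(bisect_left(cumulative, explained_var) + 1, len(cumulative))


# Made-up explained variance ratios, in descending order.
ratios = [0.40, 0.30, 0.20, 0.10]

n_closest = closest_rule(ratios, 0.45)          # cumulative 0.40 is nearest to 0.45
n_crossing = first_crossing_rule(ratios, 0.45)  # cumulative 0.70 is the first value >= 0.45
```

With these ratios the closest rule selects 1 component, which explains only 40% against a 45% target, while the first-crossing rule selects 2. With NumPy, the same first-crossing idea could be written using `np.searchsorted` on the cumulative sum.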

arbitragelab/pca_approach/pca_approach.py

Comment on lines +149 to +156

+                      A function to calculate weights (scaled eigen vectors) to use for factor return calculation with
+                      asymptotic PCA.
+                      Weights are calculated from PCA components as:
+                      Weight = Eigen vector / std.(R)
+                      So the output is a dataframe containing the weight for each asset in a portfolio for each eigen vector.

Contributor

PanPip Mar 19, 2021

Please adjust this docstring to reflect the idea behind the asym PCA.

PanPip assigned PanPip and unassigned ghost

PanPip removed their assignment

