
Daily Standup in June #1

Closed
nissy-dev opened this issue Jun 2, 2020 · 14 comments
@nissy-dev

nissy-dev commented Jun 2, 2020

Daily standup in June

I use this template:
https://www.range.co/blog/complete-guide-daily-standup-meeting-agenda

6/2

Yesterday

  • set up my local environment
    • I tried to use Docker, but my PC doesn't have enough disk space to build the JAXChem image

Today

I will get familiar with JAX.

Blockers

No blockers today, since today's task is just absorbing knowledge.

@nissy-dev

nissy-dev commented Jun 2, 2020

6/3

Yesterday

I tried some JAX tutorials and got familiar with JAX.
I categorized them into required and optional ones.
My notebook is here

Required notebook

Optional notebook

  • The Autodiff Cookbook
    • It seems we only need the Gradients section at the beginning.
  • Autobatching log-densities example
    • It seems we only need up to the "Write the log-joint function for the model" section.
    • If you have experience implementing a NN from scratch in NumPy, it is easy to understand vmap's advantage.
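The vmap advantage in the last bullet can be sketched even without JAX: you normally write a function for a single example and then need a batched version, and jax.vmap derives the batched version automatically so you never rewrite the math by hand. A minimal NumPy illustration of the two versions vmap makes equivalent (the function and variable names here are my own, not from the tutorial):

```python
import numpy as np

def log_density(x, mean, std):
    # Gaussian log-density for a single example, written naturally
    return -0.5 * np.log(2 * np.pi * std ** 2) - 0.5 * ((x - mean) / std) ** 2

batch = np.array([0.5, -1.0, 2.0])

# Manual batching: a Python loop over examples (what vmap replaces)
looped = np.array([log_density(x, 0.0, 1.0) for x in batch])

# Hand-vectorized version: the same math applied to the whole array at once
vectorized = log_density(batch, 0.0, 1.0)

assert np.allclose(looped, vectorized)
```

In JAX, `jax.vmap(log_density, in_axes=(0, None, None))` would produce the vectorized behavior directly from the per-example function.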

Today

I will get familiar with using JAX for GNNs.

  • Try a GNN tutorial based on this blog post
    • Especially, I will focus on the "3. JAX implementation" section.

Blockers

No blockers today, since today's task is just absorbing knowledge.

@nissy-dev

6/4

Yesterday

I tried a GCN tutorial in JAX and read the MNIST example code for dm-haiku.
Haiku is a simple, lightweight NN library, and I will try to use it with JAXChem.
My notebook is here
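What makes Haiku attractive for JAX is its functional design: hk.transform turns a module into a pure (init, apply) pair of functions with explicit parameters. A rough plain-NumPy sketch of that pattern, not the actual Haiku API:

```python
import numpy as np

# Haiku-style functional pattern (sketch only): a model is a pure pair of
# functions, with all parameters passed in explicitly as data.
def linear_init(rng, in_dim, out_dim):
    w = rng.normal(size=(in_dim, out_dim)) * 0.01
    b = np.zeros(out_dim)
    return {"w": w, "b": b}

def linear_apply(params, x):
    # No hidden state: the same params always give the same output
    return x @ params["w"] + params["b"]

rng = np.random.default_rng(0)
params = linear_init(rng, 3, 4)
out = linear_apply(params, np.ones((2, 3)))
assert out.shape == (2, 4)
```

Because apply is a pure function of (params, x), it composes cleanly with jax.grad and jax.jit, which is the main reason this style fits JAX.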

Today

I will implement a graph property prediction model by refactoring yesterday's tutorial GCN.
(The tutorial builds a node classification model for the Cora dataset.)

  • Build a graph property prediction model for Tox21
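The main change when refactoring a node classification GCN into graph property prediction is adding a readout that pools node embeddings into a single graph vector before the prediction head. A NumPy sketch under my own naming (mean pooling and 12 output tasks, as in Tox21, are assumptions, not jaxchem's actual design):

```python
import numpy as np

def readout_mean(node_feats):
    """Pool node embeddings (n_nodes, hidden) into one graph vector (hidden,)."""
    return node_feats.mean(axis=0)

def predict_graph(node_feats, w):
    # Graph-level head: pooled vector -> per-task logits
    # (e.g. 12 toxicity tasks for Tox21)
    return readout_mean(node_feats) @ w

h = np.ones((5, 8))      # 5 nodes, hidden size 8
w = np.zeros((8, 12))    # hypothetical readout weights, 12 Tox21 tasks
assert predict_graph(h, w).shape == (12,)
```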

Blockers

  • This implementation is a little hard; I may not finish it in one day.

@nissy-dev

6/6

Yesterday

I implemented a graph property prediction model by refactoring yesterday's tutorial GCN.
I opened a draft PR: deepchem#3

Today

Mainly, I will implement the training code for Tox21.

  • Implement a Tox21 tutorial with the graph property prediction model.

Blockers

  • This implementation is a little hard; I may not finish it in one day.

@nissy-dev

nissy-dev commented Jun 8, 2020

6/8

Yesterday

Mainly, I implemented the GCN code for Tox21.

Today

Mainly, I will implement the GCN code for Tox21.

PS: I finished the GCN implementation for Tox21 and rethought my plan.
Please check this issue: deepchem#1

Blockers

No blockers today.

@nissy-dev

nissy-dev commented Jun 10, 2020

6/10

Yesterday (6/8)

I implemented the GCN code for Tox21.

Today

Mainly, I will set up the test environment and look into pytest.
Before refactoring with Haiku, I will just write shape tests.
After refactoring with Haiku and officially deciding to use Haiku in jaxchem,
I will write further tests.

Blockers

No blockers today.

@nissy-dev

nissy-dev commented Jun 11, 2020

6/12

Yesterday

I set up some environments and implemented the normalization of the adjacency matrix for GCN:

  • implemented a sample test using pytest
  • added a config for GitHub Actions
  • added setup.py
  • implemented the normalization of the adjacency matrix for GCN
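For reference, the standard GCN normalization of the adjacency matrix is the symmetric form D^{-1/2} (A + I) D^{-1/2} with self-loops added. A small NumPy sketch (the helper name is mine; jaxchem's actual implementation may differ):

```python
import numpy as np

def normalize_adj(adj):
    """Symmetric GCN normalization: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])        # add self-loops
    deg = a_hat.sum(axis=1)                   # node degrees
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    return d_inv_sqrt @ a_hat @ d_inv_sqrt

adj = np.array([[0., 1.], [1., 0.]])          # two connected nodes
norm = normalize_adj(adj)
assert np.allclose(norm, norm.T)              # stays symmetric
```

This normalization keeps repeated message passing from blowing up feature scales on high-degree nodes.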

Today

Mainly, I will write a shape test for the GCN

Blockers

No blockers today.

@nissy-dev

nissy-dev commented Jun 15, 2020

6/15

Yesterday

I wrote a shape test for the GCN
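A shape test of this kind just checks that a layer maps known input shapes to the expected output shape. A minimal sketch of the idea in pytest style (the layer here is a simplified stand-in, not jaxchem's actual GCN):

```python
import numpy as np

def gcn_layer(adj, feats, w):
    # One simplified GCN layer (activation omitted):
    # aggregate neighbor features, then project with the weight matrix.
    return adj @ feats @ w

def test_gcn_layer_shape():
    # 4 nodes, 3 input features, 8 hidden units
    adj = np.eye(4)
    feats = np.random.rand(4, 3)
    w = np.random.rand(3, 8)
    out = gcn_layer(adj, feats, w)
    assert out.shape == (4, 8)

test_gcn_layer_shape()  # pytest would collect test_* functions automatically
```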

Today

Mainly, I will refactor the GCN model using Haiku. Today, I learned that jax-md uses Haiku.

Blockers

No blockers, but this implementation is a little hard.
I may not finish it in one day.

@nissy-dev

6/17

Yesterday

I took the day off.

Today

Mainly, I will refactor the GCN model using Haiku.

Blockers

No blockers, but this implementation is a little hard.
I may not finish it in one day.

@nissy-dev

nissy-dev commented Jun 19, 2020

6/19

Yesterday

I finished refactoring the GCN model using Haiku.

Today

Mainly, I will implement the sparse pattern GCN model.
The previous model's input is an adjacency matrix, but this model's input is an adjacency list.
An adjacency list is more memory efficient than an adjacency matrix.
I posted the next detailed plan:

  • Implement the sparse pattern GCN model <- today's work (6/20~22)
  • Add a QM9 example (6/24~25)
  • Compare the training time among jaxchem, DGL, DeepChem, and PyTorch Geometric (6/26~27)
  • Add a Colab notebook example (6/29~30)
  • Post a blog post on the DeepChem forum (6/30)
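The difference between the two patterns can be sketched in NumPy: the dense (pad) pattern aggregates neighbors via a matmul with the full adjacency matrix, while the sparse pattern stores only the edge list and scatter-adds features. Here np.add.at plays the role jax.ops.index_add plays in the sparse model; the tiny graph is my own example:

```python
import numpy as np

# Dense (pad) pattern: aggregation is a matmul with the adjacency matrix.
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
h = np.arange(6.0).reshape(3, 2)            # node features
dense_agg = adj @ h

# Sparse pattern: store only edges (src, dst) and scatter-add features.
src = np.array([0, 1, 1, 2])
dst = np.array([1, 0, 2, 1])
sparse_agg = np.zeros_like(h)
np.add.at(sparse_agg, dst, h[src])          # analogue of jax.ops.index_add

assert np.allclose(dense_agg, sparse_agg)   # same result, O(edges) memory
```

For molecular graphs, which are very sparse, the edge list grows with the number of bonds rather than with n_nodes squared.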

Blockers

No blockers.

@nissy-dev

6/22

Yesterday

I implemented the sparse pattern GCN model.

Today

I implemented the sparse pattern GCN model and an example.
I faced a performance issue and I'm struggling to resolve it.

Blockers

No blockers.

@nissy-dev

nissy-dev commented Jun 26, 2020

6/26

Yesterday

I couldn't work on jaxchem on 6/23 and 6/24 because of my research tasks.
I started resolving the performance issue yesterday.

Today

I'm struggling to resolve the performance issue.

The performance issue is related to jax-ml/jax#2242.
SparseGCN uses jax.ops.index_add, but a large Python "for" loop leads to a serious performance issue when using jax.ops.index_add (the training time of this example is 24 times that of the PadGCN example).

I am trying to rewrite the training loop using lax.scan or lax.fori_loop. However, generators/iterators don't work inside lax.scan or lax.fori_loop, so it is taking more time than expected. (I filed an issue in the jax repo: jax-ml/jax#3567)

If I can't use a generator/iterator, I may have to write code that converts a DiskDataset to a plain Dataset.
The plan will be delayed by a few days; the following plan is the worst case. I want to post the blog by the end of the first evaluation term (7/4).
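The lax.scan limitation comes down to data representation: scan folds a function over the leading axis of a fixed array, so a Python generator of batches must first be materialized as one stacked array. A NumPy sketch of that conversion, where scan below is a plain-Python stand-in for lax.scan (not the real API):

```python
import numpy as np

def batch_generator():
    # Generators like this cannot be traced by lax.scan / lax.fori_loop...
    for i in range(3):
        yield np.full((2,), float(i))

# ...so the data must first be materialized as one stacked array,
# which scan can then iterate over along the leading axis.
batches = np.stack(list(batch_generator()))   # shape (3, 2)

def scan(f, init, xs):
    # Plain-Python stand-in for lax.scan: fold f over the leading axis.
    carry = init
    for x in xs:
        carry = f(carry, x)
    return carry

total = scan(lambda c, x: c + x.sum(), 0.0, batches)
assert total == 6.0   # sums of the three batches: 0 + 2 + 4
```

The cost is that the whole dataset (or each shard) must fit in memory as one array, which is why a generator-backed DiskDataset does not drop in directly.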

Blockers

No blockers.

@nissy-dev

nissy-dev commented Jun 29, 2020

6/29

Yesterday

I worked on resolving the performance issue, but I couldn't finish.

Today

Today, I will write the summary and issues and rethink the plan.

This week, I have decided to focus on writing the summary and issue details for the GSoC evaluation.
I will also make a Colab notebook example and post a blog on the DeepChem Forum.

Updated Plan

  • Resolve the performance issue -> skipped; next period's task
  • Add a QM9 example -> skipped; next period's task
  • Compare the training time (Tox21) among jaxchem, DGL, DeepChem, and PyTorch Geometric (6/30~7/2)
  • Add a Colab notebook example (7/3)
  • Post a blog post on the DeepChem forum (7/3)

Blockers

No blockers.

@nissy-dev

nissy-dev commented Jun 29, 2020

Summary for 1st evaluation period

I have spent four weeks with DeepChem as a GSoC student, and the 1st evaluation has come!
I want to explain what I did in those four weeks.

JAXChem

Summary

As I mentioned in the roadmap (deepchem#1), I tried to implement GCN models and make tutorials during the 1st evaluation period. I chose this topic because the GCN (GNN) is the most popular deep learning method in the area of chemistry. I think this is a good starting point for JAXChem.

During the 1st evaluation period, I implemented two patterns of GCN model (the pad pattern and the sparse pattern).

If you want to check the details of the difference between the two models, please see the roadmap (deepchem#1).
One of the challenging points of JAXChem was implementing the sparse pattern GCN model; pad pattern modeling is easier, and a blog post like this was already published.
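For context, the pad pattern handles variable-size graphs by embedding each one into fixed-size arrays plus a node mask, which keeps every batch the same shape for XLA. A NumPy sketch with hypothetical names:

```python
import numpy as np

def pad_graph(adj, feats, max_nodes):
    """Pad pattern: embed a small graph into fixed-size arrays plus a mask."""
    n = adj.shape[0]
    adj_p = np.zeros((max_nodes, max_nodes))
    adj_p[:n, :n] = adj
    feats_p = np.zeros((max_nodes, feats.shape[1]))
    feats_p[:n] = feats
    mask = np.zeros(max_nodes)
    mask[:n] = 1.0                      # marks real vs padded nodes
    return adj_p, feats_p, mask

adj = np.ones((2, 2)) - np.eye(2)       # 2-node graph
feats = np.ones((2, 3))
adj_p, feats_p, mask = pad_graph(adj, feats, max_nodes=4)
assert adj_p.shape == (4, 4) and mask.sum() == 2.0
```

The trade-off is wasted memory and compute on the padded rows, which is exactly what the sparse pattern avoids.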

While implementing these models, I modified the June roadmap (deepchem#1 (comment)) following some advice, and I prioritized making our code more readable and maintainable. I listed what I did.

Issues

I found a performance issue with the sparse pattern GCN model when making the Tox21 example.

The performance issue is related to jax-ml/jax#2242. The sparse pattern GCN model uses jax.ops.index_add, but a large Python "for" loop leads to a serious performance issue when using jax.ops.index_add (the training time per epoch of the Tox21 example is 24 times that of the pad pattern GCN model).

In order to resolve this issue, I have to rewrite the training loop using lax.scan or lax.fori_loop. However, lax.scan and lax.fori_loop have some limitations, such as generators/iterators not working inside them (see jax-ml/jax#3567), so the rewrite is difficult. I'm still struggling with this issue; please see the details in deepchem#8.

Next plan (during 2nd evaluation period)

According to the roadmap, I was supposed to be working on implementing the CGCNN model. However, I will change this plan. In the next period, I will focus on resolving the performance issue and writing the documentation. Please see the details below.

  • Build the JAXChem documentation (7/6 - 7/12)
  • Resolve the performance issue (7/13 - 7/27)
  • Add more examples (7/27 - 8/3)

There are two reasons for changing the plan. First, the CGCNN model is similar to the sparse pattern model. Second, it seems that DeepChem's crystal support is currently at too early a stage and still needs many fixes.
On the other hand, I will not change the plan for the final evaluation period (implementing the Molecular Attention Transformer).

DeepChem

My official project is JAXChem, but I have also committed to the DeepChem core code, since JAXChem is one of the DeepChem projects. I think improving the DeepChem core code is really important so that many users learn about the JAXChem project and want to use it.

During the 1st evaluation period, I mainly cleaned up old documentation and build systems. I listed what I did in detail.

@nissy-dev

Go to #3

@nissy-dev nissy-dev mentioned this issue Jul 4, 2020