
Daily Standup in June #1

Closed
nissy-dev opened this issue Jun 2, 2020 · 14 comments
@nissy-dev

nissy-dev commented Jun 2, 2020

Daily standup in June

I use this template:
https://www.range.co/blog/complete-guide-daily-standup-meeting-agenda

6/2

Yesterday

  • set up my local environment
    • I tried to use Docker, but my PC doesn't have enough disk space to build the JAXChem image

Today

I will get familiar with JAX.

Blockers

No blockers today, since today's task is just absorbing knowledge.

@nissy-dev

nissy-dev commented Jun 2, 2020

6/3

Yesterday

I tried some JAX tutorials and got familiar with JAX.
I categorized them into required and optional ones.
My notebook is here

Required notebook

Optional notebook

  • The Autodiff Cookbook
    • It seems we only need the Gradients section at the beginning.
  • Autobatching log-densities example
    • It seems we only need up to the "Write the log-joint function for the model" section.
    • If you have experience implementing a NN from scratch in NumPy, it is easy to understand vmap's advantage.
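The vmap advantage in the last bullet can be sketched even without JAX: you normally write a function for a single example and then need a batched version, and jax.vmap derives the batched version automatically so you never rewrite the math by hand. A minimal NumPy illustration of the two versions vmap makes equivalent (the function and variable names here are my own, not from the tutorial):

```python
import numpy as np

def log_density(x, mean, std):
    # Gaussian log-density for a single example, written naturally
    return -0.5 * np.log(2 * np.pi * std ** 2) - 0.5 * ((x - mean) / std) ** 2

batch = np.array([0.5, -1.0, 2.0])

# Manual batching: a Python loop over examples (what vmap replaces)
looped = np.array([log_density(x, 0.0, 1.0) for x in batch])

# Hand-vectorized version: the same math applied to the whole array at once
vectorized = log_density(batch, 0.0, 1.0)

assert np.allclose(looped, vectorized)
```

In JAX, `jax.vmap(log_density, in_axes=(0, None, None))` would produce the vectorized behavior directly from the per-example function.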

Today

I will get familiar with using JAX for GNNs.

  • Try a GNN tutorial based on this blog post
    • Especially, I will focus on the "3. JAX implementation" section.

Blockers

No blockers today, since today's task is just absorbing knowledge.

@nissy-dev

6/4

Yesterday

I tried a GCN tutorial in JAX and read the MNIST example code for dm-haiku.
Haiku is a simple, lightweight NN library, and I will try to use it with JAXChem.
My notebook is here
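What makes Haiku attractive for JAX is its functional design: hk.transform turns a module into a pure (init, apply) pair of functions with explicit parameters. A rough plain-NumPy sketch of that pattern, not the actual Haiku API:

```python
import numpy as np

# Haiku-style functional pattern (sketch only): a model is a pure pair of
# functions, with all parameters passed in explicitly as data.
def linear_init(rng, in_dim, out_dim):
    w = rng.normal(size=(in_dim, out_dim)) * 0.01
    b = np.zeros(out_dim)
    return {"w": w, "b": b}

def linear_apply(params, x):
    # No hidden state: the same params always give the same output
    return x @ params["w"] + params["b"]

rng = np.random.default_rng(0)
params = linear_init(rng, 3, 4)
out = linear_apply(params, np.ones((2, 3)))
assert out.shape == (2, 4)
```

Because apply is a pure function of (params, x), it composes cleanly with jax.grad and jax.jit, which is the main reason this style fits JAX.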

Today

I will implement a graph property prediction model by refactoring yesterday's tutorial GCN.
(The tutorial builds a node classification model for the Cora dataset.)

  • Build a graph property prediction model for Tox21
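The main change when refactoring a node classification GCN into graph property prediction is adding a readout that pools node embeddings into a single graph vector before the prediction head. A NumPy sketch under my own naming (mean pooling and 12 output tasks, as in Tox21, are assumptions, not jaxchem's actual design):

```python
import numpy as np

def readout_mean(node_feats):
    """Pool node embeddings (n_nodes, hidden) into one graph vector (hidden,)."""
    return node_feats.mean(axis=0)

def predict_graph(node_feats, w):
    # Graph-level head: pooled vector -> per-task logits
    # (e.g. 12 toxicity tasks for Tox21)
    return readout_mean(node_feats) @ w

h = np.ones((5, 8))      # 5 nodes, hidden size 8
w = np.zeros((8, 12))    # hypothetical readout weights, 12 Tox21 tasks
assert predict_graph(h, w).shape == (12,)
```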

Blockers

  • This implementation is a little hard; I may not finish it in one day.

@nissy-dev

6/6

Yesterday

I implemented a graph property prediction model by refactoring yesterday's tutorial GCN.
I opened a draft PR: deepchem#3

Today

Mainly, I will implement the training code for Tox21.

  • Implement a Tox21 tutorial with the graph property prediction model.

Blockers

  • This implementation is a little hard; I may not finish it in one day.

@nissy-dev

nissy-dev commented Jun 8, 2020

6/8

Yesterday

Mainly, I implemented the GCN code for Tox21.

Today

Mainly, I will implement the GCN code for Tox21.

PS: I finished the GCN implementation for Tox21 and rethought my plan.
Please check this issue: deepchem#1

Blockers

No blockers today.

@nissy-dev

nissy-dev commented Jun 10, 2020

6/10

Yesterday (6/8)

I implemented the GCN code for Tox21.

Today

Mainly, I will set up the test environment and look into pytest.
Before refactoring with Haiku, I will just write shape tests.
After refactoring with Haiku and officially deciding to use Haiku in jaxchem,
I will write further tests.

Blockers

No blockers today.

@nissy-dev

nissy-dev commented Jun 11, 2020

6/12

Yesterday

I set up some environments and implemented the normalization of the adjacency matrix for GCN:

  • implemented a sample test using pytest
  • added a config for GitHub Actions
  • added setup.py
  • implemented the normalization of the adjacency matrix for GCN
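For reference, the standard GCN normalization of the adjacency matrix is the symmetric form D^{-1/2} (A + I) D^{-1/2} with self-loops added. A small NumPy sketch (the helper name is mine; jaxchem's actual implementation may differ):

```python
import numpy as np

def normalize_adj(adj):
    """Symmetric GCN normalization: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])        # add self-loops
    deg = a_hat.sum(axis=1)                   # node degrees
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    return d_inv_sqrt @ a_hat @ d_inv_sqrt

adj = np.array([[0., 1.], [1., 0.]])          # two connected nodes
norm = normalize_adj(adj)
assert np.allclose(norm, norm.T)              # stays symmetric
```

This normalization keeps repeated message passing from blowing up feature scales on high-degree nodes.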

Today

Mainly, I will write a shape test for the GCN

Blockers

No blockers today.

@nissy-dev

nissy-dev commented Jun 15, 2020

6/15

Yesterday

I wrote a shape test for the GCN
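A shape test of this kind just checks that a layer maps known input shapes to the expected output shape. A minimal sketch of the idea in pytest style (the layer here is a simplified stand-in, not jaxchem's actual GCN):

```python
import numpy as np

def gcn_layer(adj, feats, w):
    # One simplified GCN layer (activation omitted):
    # aggregate neighbor features, then project with the weight matrix.
    return adj @ feats @ w

def test_gcn_layer_shape():
    # 4 nodes, 3 input features, 8 hidden units
    adj = np.eye(4)
    feats = np.random.rand(4, 3)
    w = np.random.rand(3, 8)
    out = gcn_layer(adj, feats, w)
    assert out.shape == (4, 8)

test_gcn_layer_shape()  # pytest would collect test_* functions automatically
```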

Today

Mainly, I will refactor the GCN model using Haiku. Today, I learned that jax-md uses Haiku.

Blockers

No blockers, but this implementation is a little hard.
I may not finish it in one day.

@nissy-dev

6/17

Yesterday

I took the day off.

Today

Mainly, I will refactor the GCN model using Haiku.

Blockers

No blockers, but this implementation is a little hard.
I may not finish it in one day.

@nissy-dev

nissy-dev commented Jun 19, 2020

6/19

Yesterday

I finished refactoring the GCN model using Haiku.

Today

Mainly, I will implement the sparse pattern GCN model.
The previous model's input is an adjacency matrix, but this model's input is an adjacency list.
An adjacency list is more memory efficient than an adjacency matrix.
I posted the next detailed plan:

  • Implement the sparse pattern GCN model <- today's work (6/20~22)
  • Add a QM9 example (6/24~25)
  • Compare the training time among jaxchem, DGL, DeepChem, and PyTorch Geometric (6/26~27)
  • Add a Colab notebook example (6/29~30)
  • Post a blog post on the DeepChem forum (6/30)
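The difference between the two patterns can be sketched in NumPy: the dense (pad) pattern aggregates neighbors via a matmul with the full adjacency matrix, while the sparse pattern stores only the edge list and scatter-adds features. Here np.add.at plays the role jax.ops.index_add plays in the sparse model; the tiny graph is my own example:

```python
import numpy as np

# Dense (pad) pattern: aggregation is a matmul with the adjacency matrix.
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
h = np.arange(6.0).reshape(3, 2)            # node features
dense_agg = adj @ h

# Sparse pattern: store only edges (src, dst) and scatter-add features.
src = np.array([0, 1, 1, 2])
dst = np.array([1, 0, 2, 1])
sparse_agg = np.zeros_like(h)
np.add.at(sparse_agg, dst, h[src])          # analogue of jax.ops.index_add

assert np.allclose(dense_agg, sparse_agg)   # same result, O(edges) memory
```

For molecular graphs, which are very sparse, the edge list grows with the number of bonds rather than with n_nodes squared.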

Blockers

No blockers.

@nissy-dev

6/22

Yesterday

I implemented the sparse pattern GCN model.

Today

I implemented the sparse pattern GCN model and an example.
I faced a performance issue and I'm struggling to resolve it.

Blockers

No blockers.

@nissy-dev

nissy-dev commented Jun 26, 2020

6/26

Yesterday

I couldn't work on jaxchem on 6/23 and 6/24 because of my research tasks.
I started resolving the performance issue yesterday.

Today

I'm struggling to resolve the performance issue.

The performance issue is related to jax-ml/jax#2242.
SparseGCN uses jax.ops.index_add, but a large Python "for" loop leads to a serious performance issue when using jax.ops.index_add (the training time of this example is 24 times that of the PadGCN example).

I am trying to rewrite the training loop using lax.scan or lax.fori_loop. However, generators/iterators don't work inside lax.scan or lax.fori_loop, so it is taking more time than expected. (I filed an issue in the jax repo: jax-ml/jax#3567)

If I can't use a generator/iterator, I may have to write code that converts a DiskDataset to a plain Dataset.
The plan will be delayed by a few days; the following plan is the worst case. I want to post the blog by the end of the first evaluation term (7/4).
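The lax.scan limitation comes down to data representation: scan folds a function over the leading axis of a fixed array, so a Python generator of batches must first be materialized as one stacked array. A NumPy sketch of that conversion, where scan below is a plain-Python stand-in for lax.scan (not the real API):

```python
import numpy as np

def batch_generator():
    # Generators like this cannot be traced by lax.scan / lax.fori_loop...
    for i in range(3):
        yield np.full((2,), float(i))

# ...so the data must first be materialized as one stacked array,
# which scan can then iterate over along the leading axis.
batches = np.stack(list(batch_generator()))   # shape (3, 2)

def scan(f, init, xs):
    # Plain-Python stand-in for lax.scan: fold f over the leading axis.
    carry = init
    for x in xs:
        carry = f(carry, x)
    return carry

total = scan(lambda c, x: c + x.sum(), 0.0, batches)
assert total == 6.0   # sums of the three batches: 0 + 2 + 4
```

The cost is that the whole dataset (or each shard) must fit in memory as one array, which is why a generator-backed DiskDataset does not drop in directly.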

Blockers

No blockers.

@nissy-dev

nissy-dev commented Jun 29, 2020

6/29

Yesterday

I worked on resolving the performance issue, but I couldn't finish.

Today

Today, I will write the summary and issues and rethink the plan.

This week, I have decided to focus on writing the summary and issue details for the GSoC evaluation.
I will also make a Colab notebook example and post a blog on the DeepChem Forum.

Updated Plan

  • Resolve the performance issue -> skipped; next period's task
  • Add a QM9 example -> skipped; next period's task
  • Compare the training time (Tox21) among jaxchem, DGL, DeepChem, and PyTorch Geometric (6/30~7/2)
  • Add a Colab notebook example (7/3)
  • Post a blog post on the DeepChem forum (7/3)

Blockers

No blockers.

@nissy-dev

nissy-dev commented Jun 29, 2020

Summary for 1st evaluation period

I have spent four weeks with DeepChem as a GSoC student, and the 1st evaluation has come!
I want to explain what I did in those four weeks.

JAXChem

Summary

As I mentioned in the roadmap (deepchem#1), I tried to implement GCN models and make tutorials during the 1st evaluation period. I chose this topic because the GCN (GNN) is the most popular deep learning method in the area of chemistry. I think this is a good starting point for JAXChem.

During the 1st evaluation period, I implemented two patterns of GCN model (the pad pattern and the sparse pattern).

If you want to check the details of the difference between the two models, please see the roadmap (deepchem#1).
One of the challenging points of JAXChem was implementing the sparse pattern GCN model; pad pattern modeling is easier, and a blog post like this was already published.
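For context, the pad pattern handles variable-size graphs by embedding each one into fixed-size arrays plus a node mask, which keeps every batch the same shape for XLA. A NumPy sketch with hypothetical names:

```python
import numpy as np

def pad_graph(adj, feats, max_nodes):
    """Pad pattern: embed a small graph into fixed-size arrays plus a mask."""
    n = adj.shape[0]
    adj_p = np.zeros((max_nodes, max_nodes))
    adj_p[:n, :n] = adj
    feats_p = np.zeros((max_nodes, feats.shape[1]))
    feats_p[:n] = feats
    mask = np.zeros(max_nodes)
    mask[:n] = 1.0                      # marks real vs padded nodes
    return adj_p, feats_p, mask

adj = np.ones((2, 2)) - np.eye(2)       # 2-node graph
feats = np.ones((2, 3))
adj_p, feats_p, mask = pad_graph(adj, feats, max_nodes=4)
assert adj_p.shape == (4, 4) and mask.sum() == 2.0
```

The trade-off is wasted memory and compute on the padded rows, which is exactly what the sparse pattern avoids.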

While implementing these models, I modified the June roadmap (deepchem#1 (comment)) following some advice, and I prioritized making our code more readable and maintainable. I listed what I did.

Issues

I found a performance issue with the sparse pattern GCN model when making the Tox21 example.

The performance issue is related to jax-ml/jax#2242. The sparse pattern GCN model uses jax.ops.index_add, but a large Python "for" loop leads to a serious performance issue when using jax.ops.index_add (the training time per epoch of the Tox21 example is 24 times that of the pad pattern GCN model).

In order to resolve this issue, I have to rewrite the training loop using lax.scan or lax.fori_loop. However, lax.scan and lax.fori_loop have some limitations, such as generators/iterators not working inside them (see jax-ml/jax#3567), so the rewrite is difficult. I'm still struggling with this issue; please see the details in deepchem#8.

Next plan (during 2nd evaluation period)

According to the roadmap, I was supposed to be working on implementing the CGCNN model. However, I will change this plan. In the next period, I will focus on resolving the performance issue and writing the documentation. Please see the details below.

  • Build the JAXChem documentation (7/6 - 7/12)
  • Resolve the performance issue (7/13 - 7/27)
  • Add more examples (7/27 - 8/3)

There are two reasons for changing the plan. First, the CGCNN model is similar to the sparse pattern model. Second, it seems that DeepChem's crystal support is currently at too early a stage and still needs many fixes.
On the other hand, I will not change the plan for the final evaluation period (implementing the Molecular Attention Transformer).

DeepChem

My official project is JAXChem, but I have also committed to the DeepChem core code, since JAXChem is one of the DeepChem projects. I think improving the DeepChem core code is really important so that many users learn about the JAXChem project and want to use it.

During the 1st evaluation period, I mainly cleaned up old documentation and build systems. I listed what I did in detail.

@nissy-dev

Go to #3

@nissy-dev nissy-dev mentioned this issue Jul 4, 2020