-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implemented Parallel Download of Forms using Multi-Threading and Locks #60
base: develop
Are you sure you want to change the base?
Conversation
…l and Added an Initial-Setup Guide.
try { | ||
installEverything(tempMediaPath, fileResult, parsedFields, formsDirPath, newAttachmentsDetected); | ||
lock.unlock(); | ||
} catch (FormDownloadException.DiskError e) { | ||
cleanUp(fileResult, tempMediaPath); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You haven't released the lock in catch. This could lead to a deadlock if an exception occurs. This also needs to be handled in other catch blocks as well.
|
||
index++; | ||
} | ||
executorService.shutdown(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
executorService may shut down before all the threads have completed work, it is not a blocking method. In such a scenario results might not contain all the forms data.
new Handler(Looper.getMainLooper()).post(new Runnable() { | ||
@Override | ||
public void run() { | ||
results.put(serverFormDetails, null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HashMaps are not thread safe.
} | ||
String currentFormNumber = String.valueOf(index); | ||
String totalForms = String.valueOf(values[0].size()); | ||
publishProgress(serverFormDetails.getFormName(), currentFormNumber, totalForms); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this process is not synchronous now, currentFormNumber will return wrong data when used in a concurrent threaded system.
// download media files if there are any | ||
lock.lock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've acquired the lock at the beginning of the process and released the lock at the end of the download process. What this essentially does is:
- A new thread acquires the lock to download form media.
- The thread downloads the form media by making a web request.
- The thread releases the lock.
How is different than synchronous processing if the other threads are anyways waiting for the lock to be released while the download process is on?
@@ -94,8 +98,8 @@ private void processOneForm(ServerFormDetails fd, OngoingWorkListener stateListe | |||
// get the xml file | |||
// if we've downloaded a duplicate, this gives us the file | |||
fileResult = downloadXform(fd.getFormName(), fd.getDownloadUrl(), stateListener, tempDir, formsDirPath); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure this method is thread safe? If so, how?
@chinmoy12c These changes should make the media downloads also parallel and should properly shut down the executorservice and I have also released the locks in the catch blocks also. Also used locks in the downloadXForm function so that its thread safe. Is this correct ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also find a way to test these with a large number of forms. Could you set up such an environment and share the test results?
import org.odk.collect.forms.FormSourceException | ||
import org.odk.collect.forms.FormsRepository | ||
import org.odk.collect.forms.MediaFile | ||
import org.odk.collect.forms.* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove *
imports and use specific imports.
|
||
class FormMediaDownloader( | ||
private val formsRepository: FormsRepository, | ||
private val formSource: FormSource | ||
) { | ||
|
||
private val lock: Lock = ReentrantLock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, add a newline.
executorService.shutdown(); | ||
// Wait for all tasks to complete | ||
try { | ||
executorService.awaitTermination(1, TimeUnit.MINUTES); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the basis for choosing 1 minute for timeout? If there are 500 forms, will those be completed in 500 forms?
String totalForms = String.valueOf(values[0].size()); | ||
publishProgress(serverFormDetails.getFormName(), currentFormNumber, totalForms); | ||
|
||
Thread thread = new Thread() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are we limiting resource consumption on this? If there are a large number of forms let's say 1000 forms, this would open 1000 threads which is not exactly resource friendly. We should provide a configurable way to limit the number of threads.
@chinmoy12c @charanpreet-samagra I have set up my central project with 100 forms to simulate a demanding task. The proposed solution completes the download of all the forms in roughly 2 seconds (where download start time = 03:40:23.052 and end time = 03:40:25.000) When the number of forms was increased in the project to 514. The proposed solution completes the download of all the forms in roughly 8 seconds (where download start time = 15:04:27.172 and end time = 15:04:35.285) All these forms were simple forms with only one input field and no media attached. The bulk upload was done using ODK Central API and a self-written python script |
This is great @prabs3257. Could you make the final changes that I requested? |
Closes #7
What has been done to verify that this works as intended?
Tested the code by artificially increasing the download time by adding delays, results show that the code works as intended. Verified that all the form files are being downloaded and saved properly.
Why is this the best possible solution? Were any other approaches considered?
Increases the performance of the app exponentially.
How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?
This change does not affect the user
Do we need any specific form for testing your changes? If so, please attach one.
No
Does this change require updates to documentation? If so, please file an issue here and include the link below.
No
Before submitting this PR, please make sure you have:
./gradlew checkAll
and confirmed all checks still pass OR confirm CircleCI build passes and run./gradlew connectedDebugAndroidTest
locally.