TensorflowJS conversion #4
Hi, instead of using the SavedModel, build the stateful model and load the weights. Here is the code for the conversion:

import tensorflowjs as tfjs
from DTLN_model import DTLN_model

model_class = DTLN_model()
model_class.build_DTLN_model_stateful()
model_class.model.load_weights('./pretrained_model/model.h5')
tfjs.converters.save_keras_model(model_class.model, 'DTLN_js')

This code ran fine on my system. Best, |
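For completeness, a minimal sketch of what loading the converted output could look like in the browser (a sketch only; it assumes tf is the global from the tfjs script tag, that the DTLN_js directory is served over HTTP, and that the custom layers discussed further down have been registered first):

// Load the converted Keras model and print its layer summary.
const model = await tf.loadLayersModel('DTLN_js/model.json');
model.summary();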
Thanks, it works. Now I'm in the process of changing the lambda layers to custom layers that can be loaded from JS. It looks like there are three to port: two lambdas, for the fft and ifft, and one normalization layer. Is that the correct understanding? |
Yes that is correct. |
In your last check-in I noticed you have now converted to tflite by splitting the model. Are you also considering a tfjs model to run in a browser? I have started porting the layers, but it's going slowly. Basically I have created two sublayers to replace the lambdas:

class FFTlayer(tf.keras.layers.Layer):
class IFFTlayer(tf.keras.layers.Layer):

and I am now in the process of writing the serialization in JavaScript. My question is: are you considering a tfjs model as well? |
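For reference, a minimal skeleton of a tfjs custom layer with the serialization hooks mentioned here (an assumed sketch, not the final port; the className has to match the layer name written into model.json):

class FFTLayer extends tf.layers.Layer {
  static className = 'FFTLayer';  // must match the class name referenced in model.json

  call(inputs) {
    // the actual rfft / magnitude / phase logic goes here
    const x = Array.isArray(inputs) ? inputs[0] : inputs;
    return x;
  }

  computeOutputShape(inputShape) {
    return inputShape;
  }

  getConfig() {
    // nothing to serialize beyond the base config in this sketch
    return Object.assign({}, super.getConfig());
  }
}

// Register the class so tf.loadLayersModel can deserialize it from model.json.
tf.serialization.registerClass(FFTLayer);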
I don’t have any experience with JavaScript, so I will probably not do it. I can port the model to ONNX similar to the tf lite model. ONNX also has a JavaScript API and, from my first look, the model does not have to be converted. The states and the fft must be handled outside the model, similar to the tf lite model. But as I said, I don't have any experience whatsoever with signal processing in JS. |
Sounds good. Perhaps we can compare. I am getting familiar with the layers in your model. Tensorflowjs actually has all the necessary APIs to perform the irfft and rfft, so no real signal processing is necessary. It's just a matter of setting it up with the hooks for handling the serialization and the correct shapes for the model inputs at the various layers. I have never tried porting a custom layer, so it's going slowly. My call() currently looks like the code quoted in the reply below. From the input layer to the fft layer the shape is not correct, so I'm experimenting with it.
|
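As a quick check that the spectral ops are indeed available, something along these lines should run in the browser console (a sketch; in tfjs the ops live under tf.spectral, and the exact reshaping for the DTLN layers is the part that still needs experimenting):

// One 512-sample frame -> 257 complex rFFT bins and back.
const frame = tf.randomNormal([1, 512]);
const spec = tf.spectral.rfft(frame);   // shape [1, 257], complex
const mag = tf.abs(spec);               // magnitude spectrum, shape [1, 257]
const rec = tf.spectral.irfft(spec);    // back to the time domain, shape [1, 512]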
When I look at

call(x)
{
    console.log("Call Called for FFTLayer");
    // note: tfjs takes positional arguments, so tf.expand_dims(x, axis=1) becomes tf.expandDims(x, 1)
    frame = tf.expandDims(x, 1);
    // note: in tfjs the rFFT op lives under tf.spectral (tf.spectral.rfft)
    stft_dat = tf.signal.rfft(frame);
    mag = tf.abs(stft_dat);
    // note: tf.math.angle is a Python API; in tfjs the phase has to be built from tf.real/tf.imag (e.g. with tf.atan2)
    phase = tf.math.angle(stft_dat);
    return [mag, phase];
}

computeOutputShape(inputShape)
{
    console.log("Compute Output Shape!!!");
    return [1, 1, 257];
}

then the output shape is probably not correct, because the layer returns two tensors. Let us try:

computeOutputShape(inputShape)
{
    console.log("Compute Output Shape!!!");
    return [[1, 1, 257], [1, 1, 257]];
}
|
Thanks, that works. I can now load the fft layer. The output shape of the NormalizationLayer is (1, 1, 256); is this correct? |
Yes, this is correct, but you can just use the input shape as the output shape, because the normalization layer does not change the shape.

computeOutputShape(inputShape)
{
    console.log("Compute Output Shape!!!");
    return inputShape;
}
|
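If it helps, this is roughly how the normalization could look in a tfjs call() (a sketch assuming the usual instant layer normalization, i.e. mean/variance over the last axis with learnable gamma and beta created in build(); the exact DTLN variant should be checked against the Python code):

call(inputs) {
  const x = Array.isArray(inputs) ? inputs[0] : inputs;
  const mean = tf.mean(x, -1, true);                                // per-frame mean, keepDims
  const variance = tf.mean(tf.square(tf.sub(x, mean)), -1, true);   // per-frame variance
  const normed = tf.div(tf.sub(x, mean), tf.sqrt(tf.add(variance, 1e-7)));
  // this.gamma and this.beta are LayerVariables added via this.addWeight() in build()
  return tf.add(tf.mul(normed, this.gamma.read()), this.beta.read());
}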
Hi Nils, Question 2: in the separation kernel function in the DTLN model you have a comment about not using Lambda. I suspect there is a typo, i.e. when using Lambda the weights don't get updated correctly? Also, would it work if I change the Python code and make it a layer instead of a function call, I mean not a Lambda but extending Layer and converting it? I will try it and validate that the model still works correctly, but I wanted to check first. |
Yes, you can use instance normalization. It should do the exact same thing. I did not use the tfa layer because I experimented a lot with the normalization.
You can just copy the layers from the function, if that is your question. But if you like, you can also port the separation kernel to a custom layer. I don't think that's a good idea, though, because it doesn't have any advantage; the function call was just supposed to make the model more readable and modular. |
I have been able to convert and port the appropriate layers in TFJS. Below is the summary when loading it in the browser. Looks good to me. What do you think? Will try inference on real audio data.
Layer (type)   Output shape   Param #   Receives inputs |
Well done! This looks really good! |
Thanks. The next step would be to run inference with real-time audio samples. One question I have: most of your examples for real-time are with the saved model, but this tfjs conversion was done on model.h5. I hope frame-by-frame processing will be possible on this Keras model. I will try it out and see how it behaves with a random input tensor of size [1, 512] before passing real audio. |
Did you set the stateful=True flag for the LSTMs? Without that, block-by-block processing will not work. |
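One way to sanity-check that from the JS side is to look at the stateful flag on the loaded LSTM layers and push a couple of blocks through the model (a sketch; it assumes model is the loaded LayersModel and uses the [1, 512] shape from the discussion above):

// Check that the LSTMs were exported with stateful=True.
model.layers
  .filter(l => l.getClassName() === 'LSTM')
  .forEach(l => console.log(l.name, 'stateful:', l.stateful));

// Block-by-block smoke test with a random [1, 512] input tensor.
const block = tf.randomNormal([1, 512]);
const out1 = model.predict(block);
const out2 = model.predict(block);  // should differ from out1 if the states carry over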
One question. In the stateful DTLN model you use the fft layer, and in the other one you use the STFT. The difference looks like the block shift. So is the idea of not using the shift in the stateful case that it's done outside the model? I mean, for the real-time case it's done post-inference: blocks are shifted by 128 and the input data is also shifted. |
Hi Nils, I am getting this error when I run inference on the model: Uncaught (in promise) Error: The support for CAUSAL padding mode in conv1dWithBias is not implemented yet. |
Yes, the difference is the block shift. The STFT "layer" is well suited for training and for processing whole sequences. And also yes, it is easier to handle the shift outside the model during real-time inference. I wanted to create a system which works one block in, one block out. The shift is optional, but for a model without shift the whole thing must be retrained.
Then I hope they will implement it soon. Padding "same" could probably also work, but I think the network must be retrained for that. |
Hi Nils, Also, how real audio works with "same" padding remains to be seen :) |
@shilsircar the latency needs to be less than 8 ms for a 32 ms block, as @breizhn says in the
Also, since you've got a working model, I have a couple of questions:
|
.h5 norm model
I didn't have to handle it outside; tfjs handles stateful LSTMs. Trouble is, the TFJS team told me they don't have any immediate plans to implement conv1dWithBias with causal padding. I feel it's a bug, since the last layer's bias is false. |
OK. I was able to load the model too. I used the model.h5 without mag normalization. Below is the model summary:
|
How did you load the model in tfjs with all the lambdas? Based on your summary it looks like it's the Keras model in Python you loaded? In any case, the reason you have that error is probably that the output of the mag/phase lambda is not compatible. The output of the fft is mag and phase, each (1, 1, 257); see the comments above. |
I just changed the names of the lambda classes in the model json to differentiate between the two Lambda layers, as the weights were getting mixed up while loading. Yep, I was able to debug that error. But now I'm getting an error in the instant_norm_layer custom layer |
Finally, I too was able to run it in the web browser in real time!! But the average processing times are high for real-time audio. I'm getting an average of 13 ms per a |
@vinod1234567890 that's good. Yeah, it's a bit on the higher side, and the majority of the time is spent in the LSTMs. There's nothing I can do on the JS side. Will be trying wasm with SIMD soon. |
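For measuring where the time goes per block, tf.time() can wrap a single predict call (a sketch; it assumes model is the loaded LayersModel, and the numbers depend heavily on the backend):

const block = tf.randomNormal([1, 512]);
const info = await tf.time(() => model.predict(block));
console.log(`kernel: ${info.kernelMs} ms, wall: ${info.wallMs} ms`);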
@shilsircar Clearer when compared to padding 'causal', you mean? These audio samples were cleaned in Python, I assume, or in JS? Is the high inference time for the first few iterations related to webgl shader warm-up? I too should look at threading using a web worker. |
Looks like tfjs with the webgl backend doesn't support web workers the way the WebAudio API's AudioWorklet does; correct me if I'm wrong |
A web worker with OffscreenCanvas should work, based on this: https://developers.google.com/web/updates/2018/08/offscreen-canvas |
I also tried to quantize the tfjs model to float16 and uint8, resulting in a 2 MB and a 900 kB model. But for some reason, the inference time is exactly the same as with the full model. @breizhn Any idea why? Is it the number of OPs? I was assuming that since tflite quantization increased speed, I could replicate that in tfjs |
The model had problems with bird chirps before, so that isn't new behaviour. But it's cool to hear that it also works with padding same! |
I noticed that before. Maybe it is something about prioritizing the process and at the first iteration, the processes for running the network are initialized. That takes some time I think. |
Maybe. TFjs still calculates in float32. There are a lot of optimizations going on in tf-lite, like operator fusion and so on; TFjs is probably not doing such things. |
@breizhn Is this a correct translation for real-time use? I mean the block shift in the audio; real_time_dtln_audio looks like

//set some parameters
}
|
Yes, the shift is 8 ms, and this is also the reason why the inference must be under 8 ms. For changing that, the model must be retrained. A shift of 16 ms (256 samples) will also work if you retrain it, but it may have slightly decreased audio quality compared to the current version. |
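For reference, a minimal sketch of that block-shift loop on the JS side (assuming a 512-sample block, a 128-sample shift and overlap-add as in real_time_dtln_audio; model is the loaded stateful model and all other names are placeholders):

const blockLen = 512, blockShift = 128;
const inBuffer = new Float32Array(blockLen);
const outBuffer = new Float32Array(blockLen);

function processShift(newSamples) {  // newSamples: Float32Array of length blockShift
  // slide the input buffer left by blockShift and append the new samples
  inBuffer.copyWithin(0, blockShift);
  inBuffer.set(newSamples, blockLen - blockShift);

  // run one block through the stateful model
  const out = tf.tidy(() => model.predict(tf.tensor(inBuffer, [1, blockLen])));
  const outBlock = out.dataSync();
  out.dispose();

  // slide the output buffer, clear the tail, then overlap-add the estimated block
  outBuffer.copyWithin(0, blockShift);
  outBuffer.fill(0, blockLen - blockShift);
  for (let i = 0; i < blockLen; i++) outBuffer[i] += outBlock[i];

  // the first blockShift samples are now ready for playback
  return outBuffer.slice(0, blockShift);
}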
Yeah, you want maximum look-ahead in the audio for the best results for the states. Makes sense... I have run out of options now to bring tfjs below 8 ms consistently. Or, for the 40 hr one, is it the same training scripts and setup? And the only thing I guess will change is the shift parameter from 128 to 256? |
Yes, the only thing which changes is the shift. |
@breizhn |
Are you able to bring the tfjs inference to close to 8 ms? |
Without wasm SIMD, tfjs won't get to 8 ms or below. Audio is much more susceptible to delay than video, so ideally the processing time needs to be much smaller than 8 ms if the 128 block shift is used. |
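Switching the backend once the wasm build (and eventually its SIMD variant) is available only takes a few lines (a sketch; it assumes the @tensorflow/tfjs-backend-wasm package is loaded alongside tfjs):

import '@tensorflow/tfjs-backend-wasm';  // registers the 'wasm' backend
await tf.setBackend('wasm');
await tf.ready();
console.log('active backend:', tf.getBackend());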
The tfjs team responded that they will be adding complex number support, which will be great for this: tensorflow/tfjs#3585. If anyone is open to producing models with a 16 or 24 ms shift, I can test the real-time inference. @breizhn |
@shilsircar, sadly I don't have the capacity at the moment to train the networks. |
I just found that in tfjs, conv1d doesn't have an option to set useBias to false. It only supports useBias = true. |
That's correct @hchintada, but you can still get reasonable results without it. The main issue is that tfjs, while OK for images, is still not ready for real-time, low-latency use cases such as audio. I am hoping it may be better with wasm SIMD, but without complex numbers it's not going to work. |
@breizhn do you have any suggestions on data preparation to train a 40 hr norm model? I intend to try with 40 hrs; hopefully, at 21 min per epoch reduced to 120 epochs, it might give satisfactory results. I intend to use the same DNS corpus data from your forked repo. My goal is to adapt to 24 ms latency. |
Closing this. The conclusion is that DTLN is definitely portable to tfjs and can be made to run completely in the browser, but not in real time in the default configuration, because the latency requirement of 8 ms cannot be achieved. Offline processing is possible and the audio output is sufficiently clean. I am happy to write up instructions on how to do it if anyone else is interested in experimenting. @breizhn thanks for all your help. |
Please do, it would be really useful. Now that SIMD is in play, experimentation on this becomes relevant again. |
@shilsircar can you post your js code for porting the custom layers or any other instructions? Do you have any other code for real time processing tests? Would like to experiment with this further, thanks! |
@shilsircar please can you share your js code for loading the model, and the custom layer creation code? We tried converting the Keras model to TensorFlow.js and also created all the custom layers, but when we try to load the model in the browser, the JavaScript goes into some infinite loop, freezing the page. |
@shilsircar Please provide some instructions for your js code and tests to run in the browser for latency. Thank you. |
Is there any update regarding the instructions for converting? |
Hello everyone, I'm trying to integrate DTLN into my app too, but after conversion from h5 to model.json, when loading the model I got the same issue |
Sounds like it might be a bit like the tensorflow-lite conversion where the indexes get a bit confused. Maybe swap the indexes around as discussed here? |
@StuartIanNaylor I'm using this model.json file actually, and I can't seem to figure out how to visualize the input/output dims of all the layers, because I can't even load the model from the json file; it gives the error |
I am trying to convert the SavedModel with tensorflowjs 2.X using the following command:
tensorflowjs_converter --control_flow_v2=False --input_format=tf_saved_model --saved_model_tags=serve --signature_name=serving_default --strip_debug_ops=False --weight_shard_size_bytes=4194304 C:\Users\ss\Documents\workspace\DTLN\DTLN-master\pretrained_model\DTLN_norm_500h_saved_model C:\Users\ss\Documents\workspace\DTLN\tfjs
I get the following two errors:
Traceback (most recent call last):
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\importer.py", line 497, in _import_graph_def_internal
graph._c_graph, serialized, options) # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 0 of node StatefulPartitionedCall/model/lstm/AssignVariableOp was passed float from Func/StatefulPartitionedCall/input/_4:0 incompatible with expected resource.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflowjs\converters\tf_saved_model_conversion_v2.py", line 482, in convert_tf_saved_model
frozen_graph = _freeze_saved_model_v2(concrete_func, control_flow_v2)
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflowjs\converters\tf_saved_model_conversion_v2.py", line 352, in _freeze_saved_model_v2
concrete_func, lower_control_flow=not control_flow_v2).graph
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\convert_to_constants.py", line 680, in convert_variables_to_constants_v2
return _construct_concrete_function(func, output_graph_def, converted_inputs)
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\convert_to_constants.py", line 406, in _construct_concrete_function
new_output_names)
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\eager\wrap_function.py", line 633, in function_from_graph_def
wrapped_import = wrap_function(_imports_graph_def, [])
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\eager\wrap_function.py", line 611, in wrap_function
collections={}),
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\func_graph.py", line 981, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\eager\wrap_function.py", line 86, in call
return self.call_with_variable_creator_scope(self._fn)(*args, **kwargs)
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\eager\wrap_function.py", line 92, in wrapped
return fn(*args, **kwargs)
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\eager\wrap_function.py", line 631, in _imports_graph_def
importer.import_graph_def(graph_def, name="")
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\importer.py", line 405, in import_graph_def
producer_op_list=producer_op_list)
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\importer.py", line 501, in _import_graph_def_internal
raise ValueError(str(e))
ValueError: Input 0 of node StatefulPartitionedCall/model/lstm/AssignVariableOp was passed float from Func/StatefulPartitionedCall/input/_4:0 incompatible with expected resource.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflowjs\converters\wizard.py", line 606, in run
converter.convert(arguments)
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflowjs\converters\converter.py", line 681, in convert
control_flow_v2=args.control_flow_v2)
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflowjs\converters\tf_saved_model_conversion_v2.py", line 485, in convert_tf_saved_model
output_node_names)
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflowjs\converters\tf_saved_model_conversion_v2.py", line 342, in _freeze_saved_model_v1
sess, g.as_graph_def(), output_node_names)
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\util\deprecation.py", line 324, in new_func
return func(*args, **kwargs)
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\graph_util_impl.py", line 359, in convert_variables_to_constants
inference_graph = extract_sub_graph(input_graph_def, output_node_names)
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\util\deprecation.py", line 324, in new_func
return func(*args, **kwargs)
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\graph_util_impl.py", line 205, in extract_sub_graph
_assert_nodes_are_present(name_to_node, dest_nodes)
File "c:\users\ss\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\graph_util_impl.py", line 160, in _assert_nodes_are_present
assert d in name_to_node, "%s is not in graph" % d
AssertionError: Identity is not in graph