
Commit b90d61a
2.7.0 release (#3669)
1 parent 0a43851 commit b90d61a

169 files changed, +29795 -0 lines changed


cpu/2.7.0+cpu/.buildinfo (+4)

# Sphinx build info version 1
# This file records the configuration used when building these files. When it is not found, a full rebuild will be done.
config: fa8b07ca4c732db050ee9c1a7147397c
tags: 645f666f9bcd5a90fca523b33c5a78b7
New images under cpu/2.7.0+cpu/_images/:

- 1ins_cus.gif (212 KB)
- 1ins_log.gif (385 KB)
- 1ins_phy.gif (308 KB)
- 1ins_soc.gif (227 KB)
- GenAI-bf16.gif (702 KB)
- GenAI-int8.gif (717 KB)
- bf16_llama.gif (842 KB)
- hypertune.png (103 KB)
- llm_iakv_1.png (25 KB)
- llm_iakv_2.png (31.1 KB)
- nins_cus.gif (365 KB)
- nins_lat.gif (534 KB)
- nins_thr.gif (319 KB)
- split_sgd.png (12.7 KB)
(+3)

# Intel® Extension for PyTorch\* CPU ISA Dynamic Dispatch Design Doc

The design document has been merged with [the ISA Dynamic Dispatch feature introduction](../../tutorials/features/isa_dynamic_dispatch.md).

cpu/2.7.0+cpu/_sources/index.rst.txt (+100)

.. meta::
   :description: This website introduces Intel® Extension for PyTorch*
   :keywords: Intel optimization, PyTorch, Intel® Extension for PyTorch*, GPU, discrete GPU, Intel discrete GPU

Intel® Extension for PyTorch*
#############################

Intel® Extension for PyTorch* extends PyTorch* with the latest performance optimizations for Intel hardware.
Optimizations take advantage of Intel® Advanced Vector Extensions 512 (Intel® AVX-512) Vector Neural Network Instructions (VNNI) and Intel® Advanced Matrix Extensions (Intel® AMX) on Intel CPUs, as well as Intel X\ :sup:`e`\ Matrix Extensions (XMX) AI engines on Intel discrete GPUs.
Moreover, Intel® Extension for PyTorch* provides easy GPU acceleration for Intel discrete GPUs through the PyTorch* ``xpu`` device.

In the current technological landscape, Generative AI (GenAI) workloads and models have gained widespread attention and popularity. Large Language Models (LLMs) have emerged as the dominant models driving these GenAI applications. Starting from 2.1.0, specific optimizations for certain LLMs are introduced in Intel® Extension for PyTorch*. For more information on LLM optimizations, refer to the `Large Language Models (LLM) <tutorials/llm.html>`_ section.

The extension can be loaded as a Python module in Python programs or linked as a C++ library for C++ programs. In Python scripts, users can enable it dynamically by importing ``intel_extension_for_pytorch``.
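
For example, a minimal inference sketch (the tiny model here is a stand-in for your own):

.. code-block:: python

   import torch
   import intel_extension_for_pytorch as ipex

   model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU()).eval()
   data = torch.randn(1, 64)

   # Apply the extension's operator, memory-layout, and fusion optimizations
   # to an eval-mode model, then run inference as usual.
   model = ipex.optimize(model)
   with torch.no_grad():
       model(data)
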
.. note::

   - GPU features are not included in CPU-only packages.
   - Optimizations in the CPU-only package may have a newer code base due to different development schedules.

Intel® Extension for PyTorch* has been released as an open-source project on `GitHub <https://github.com/intel/intel-extension-for-pytorch>`_. You can find the source code and instructions on how to get started at:

- **CPU**: `CPU main branch <https://github.com/intel/intel-extension-for-pytorch/tree/main>`_ | `Quick Start <https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/getting_started>`_
- **XPU**: `XPU main branch <https://github.com/intel/intel-extension-for-pytorch/tree/xpu-main>`_ | `Quick Start <https://intel.github.io/intel-extension-for-pytorch/xpu/latest/tutorials/getting_started>`_

You can find more information about the product at:

- `Features <https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/features>`_
- `Performance <https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/performance>`_

Architecture
------------

Intel® Extension for PyTorch* is structured as shown in the following figure:

.. figure:: ../images/intel_extension_for_pytorch_structure.png
   :width: 800
   :align: center
   :alt: Architecture of Intel® Extension for PyTorch*

   Architecture of Intel® Extension for PyTorch*

- **Eager Mode**: In eager mode, the PyTorch frontend is extended with custom Python modules (such as fusion modules), optimized optimizers, and INT8 quantization APIs. Further performance improvement is achieved by converting eager-mode models into graph mode using extended graph fusion passes.
- **Graph Mode**: In graph mode, fusions reduce operator/kernel invocation overhead, resulting in improved performance. Compared to eager mode, graph mode in PyTorch* normally yields better performance through optimization techniques such as operation fusion, and Intel® Extension for PyTorch* amplifies them with more comprehensive graph optimizations. Both the PyTorch ``TorchScript`` and ``TorchDynamo`` graph modes are supported. With ``TorchScript``, we recommend ``torch.jit.trace()`` as the preferred entry point, as it generally supports a wider range of workloads than ``torch.jit.script()``. With ``TorchDynamo``, the ``ipex`` backend is available to provide good performance. A short sketch of both paths follows this list.
- **CPU Optimization**: On CPU, Intel® Extension for PyTorch* automatically dispatches operators to the underlying kernels based on the detected instruction set architecture (ISA), leveraging the vectorization and matrix acceleration units available on Intel hardware. The runtime extension offers finer-grained thread runtime control and weight sharing for increased efficiency.
- **GPU Optimization**: On GPU, optimized operators and kernels are implemented and registered through the PyTorch dispatching mechanism and are accelerated by the native vectorization and matrix computation features of Intel GPU hardware. Intel® Extension for PyTorch* for GPU utilizes the `DPC++ <https://github.com/intel/llvm#oneapi-dpc-compiler>`_ compiler, which supports the latest `SYCL* <https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html>`_ standard as well as a number of extensions to it, which can be found in the `sycl/doc/extensions <https://github.com/intel/llvm/tree/sycl/sycl/doc/extensions>`_ directory.
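
A minimal sketch of both graph-mode paths (the model and shapes are illustrative):

.. code-block:: python

   import torch
   import intel_extension_for_pytorch as ipex

   model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU()).eval()
   data = torch.randn(1, 64)
   model = ipex.optimize(model)

   # TorchScript path: trace, freeze, then run the fused graph.
   with torch.no_grad():
       traced = torch.jit.trace(model, data)
       traced = torch.jit.freeze(traced)
       traced(data)

   # TorchDynamo path: compile with the ipex backend.
   compiled = torch.compile(model, backend="ipex")
   with torch.no_grad():
       compiled(data)
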
Support
-------

The team tracks bugs and enhancement requests using `GitHub issues <https://github.com/intel/intel-extension-for-pytorch/issues/>`_. Before submitting a suggestion or bug report, search the existing GitHub issues to see if your issue has already been reported.

.. toctree::
   :caption: ABOUT
   :maxdepth: 3
   :hidden:

   tutorials/introduction
   tutorials/features
   Large Language Models (LLM)<tutorials/llm>
   tutorials/performance
   tutorials/releases
   tutorials/known_issues
   tutorials/blogs_publications
   tutorials/license

.. toctree::
   :maxdepth: 3
   :caption: GET STARTED
   :hidden:

   tutorials/installation
   tutorials/getting_started
   tutorials/examples
   tutorials/cheat_sheet

.. toctree::
   :maxdepth: 3
   :caption: DEVELOPER REFERENCE
   :hidden:

   tutorials/api_doc

.. toctree::
   :maxdepth: 3
   :caption: PERFORMANCE TUNING
   :hidden:

   tutorials/performance_tuning/tuning_guide
   tutorials/performance_tuning/launch_script
   tutorials/performance_tuning/torchserve

.. toctree::
   :maxdepth: 3
   :caption: CONTRIBUTING GUIDE
   :hidden:

   tutorials/contribution
(+123)

API Documentation
#################

General
*******

`ipex.optimize` is generally used for generic PyTorch models.

.. automodule:: intel_extension_for_pytorch
.. autofunction:: optimize

`ipex.llm.optimize` is used for Large Language Models (LLM).

.. automodule:: intel_extension_for_pytorch.llm
.. autofunction:: optimize

.. currentmodule:: intel_extension_for_pytorch
.. autoclass:: verbose
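
For example, a minimal ``ipex.llm.optimize`` sketch (the model name is illustrative and assumes the ``transformers`` package is installed):

.. code-block:: python

   import torch
   import intel_extension_for_pytorch as ipex
   from transformers import AutoModelForCausalLM

   model = AutoModelForCausalLM.from_pretrained("gpt2")
   model.eval()

   # Apply the LLM-specific optimizations; bfloat16 is a common choice.
   model = ipex.llm.optimize(model, dtype=torch.bfloat16)
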
LLM Module Level Optimizations (Prototype)
******************************************

Module level optimization APIs are provided for optimizing customized LLMs; a usage sketch follows the class list below.

.. automodule:: intel_extension_for_pytorch.llm.modules
.. autoclass:: LinearSilu

.. currentmodule:: intel_extension_for_pytorch.llm.modules
.. autoclass:: LinearSiluMul

.. currentmodule:: intel_extension_for_pytorch.llm.modules
.. autoclass:: Linear2SiluMul

.. currentmodule:: intel_extension_for_pytorch.llm.modules
.. autoclass:: LinearRelu

.. currentmodule:: intel_extension_for_pytorch.llm.modules
.. autoclass:: LinearNewGelu

.. currentmodule:: intel_extension_for_pytorch.llm.modules
.. autoclass:: LinearGelu

.. currentmodule:: intel_extension_for_pytorch.llm.modules
.. autoclass:: LinearMul

.. currentmodule:: intel_extension_for_pytorch.llm.modules
.. autoclass:: LinearAdd

.. currentmodule:: intel_extension_for_pytorch.llm.modules
.. autoclass:: LinearAddAdd

.. currentmodule:: intel_extension_for_pytorch.llm.modules
.. autoclass:: RotaryEmbedding

.. currentmodule:: intel_extension_for_pytorch.llm.modules
.. autoclass:: RMSNorm

.. currentmodule:: intel_extension_for_pytorch.llm.modules
.. autoclass:: FastLayerNorm

.. currentmodule:: intel_extension_for_pytorch.llm.modules
.. autoclass:: IndirectAccessKVCacheAttention

.. currentmodule:: intel_extension_for_pytorch.llm.modules
.. autoclass:: PagedAttention

.. currentmodule:: intel_extension_for_pytorch.llm.modules
.. autoclass:: VarlenAttention
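
A minimal sketch of one of these modules, ``LinearSilu``, wrapping an existing ``torch.nn.Linear`` (the dimensions are illustrative):

.. code-block:: python

   import torch
   import intel_extension_for_pytorch as ipex

   linear = torch.nn.Linear(4096, 11008, bias=False)
   fused = ipex.llm.modules.LinearSilu(linear)

   x = torch.randn(1, 4096)
   y = fused(x)  # computes silu(linear(x)) as one fused operation
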
.. automodule:: intel_extension_for_pytorch.llm.functional
.. autofunction:: rotary_embedding

.. currentmodule:: intel_extension_for_pytorch.llm.functional
.. autofunction:: rms_norm

.. currentmodule:: intel_extension_for_pytorch.llm.functional
.. autofunction:: fast_layer_norm

.. currentmodule:: intel_extension_for_pytorch.llm.functional
.. autofunction:: indirect_access_kv_cache_attention

.. currentmodule:: intel_extension_for_pytorch.llm.functional
.. autofunction:: varlen_attention

Fast Bert (Prototype)
*********************

.. currentmodule:: intel_extension_for_pytorch
.. autofunction:: fast_bert
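
A minimal ``fast_bert`` sketch, mirroring the cheat-sheet entry (assumes the ``transformers`` package is installed):

.. code-block:: python

   import torch
   import intel_extension_for_pytorch as ipex
   from transformers import BertModel

   model = BertModel.from_pretrained("bert-base-uncased")
   model.eval()

   # Prototype API for BERT-specific optimization.
   model = ipex.fast_bert(model, dtype=torch.bfloat16)
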
Graph Optimization
******************

.. currentmodule:: intel_extension_for_pytorch
.. autofunction:: enable_onednn_fusion

Quantization
************

.. automodule:: intel_extension_for_pytorch.quantization
.. autofunction:: get_weight_only_quant_qconfig_mapping
.. autofunction:: prepare
.. autofunction:: convert

Prototype API; an introduction is available at the `feature page <./features/int8_recipe_tuning_api.md>`_.

.. autofunction:: autotune
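
A minimal static INT8 post-training sketch using these APIs (the model and calibration loop are placeholders):

.. code-block:: python

   import torch
   import intel_extension_for_pytorch as ipex

   model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
   data = torch.randn(1, 3, 224, 224)

   qconfig = ipex.quantization.default_static_qconfig
   prepared = ipex.quantization.prepare(model, qconfig, example_inputs=data, inplace=False)
   for _ in range(4):  # calibration loop over representative batches
       prepared(data)
   converted = ipex.quantization.convert(prepared)
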
CPU Runtime
***********

.. automodule:: intel_extension_for_pytorch.cpu.runtime
.. autofunction:: is_runtime_ext_enabled
.. autoclass:: CPUPool
.. autoclass:: pin
.. autoclass:: MultiStreamModuleHint
.. autoclass:: MultiStreamModule
.. autoclass:: Task
.. autofunction:: get_core_list_of_node_id

.. .. automodule:: intel_extension_for_pytorch.quantization
..    :members:
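
A minimal runtime-extension sketch: pin computation to the cores of one NUMA node (``node_id=0`` is an assumption about the machine):

.. code-block:: python

   import torch
   import intel_extension_for_pytorch as ipex

   model = torch.nn.Linear(64, 64).eval()
   data = torch.randn(1, 64)

   cpu_pool = ipex.cpu.runtime.CPUPool(node_id=0)
   with ipex.cpu.runtime.pin(cpu_pool):
       model(data)
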
(+39)

Blogs & Publications
====================

* [Accelerate Llama 2 with Intel AI Hardware and Software Optimizations, Jul 2023](https://www.intel.com/content/www/us/en/developer/articles/news/llama2.html)
* [Accelerate PyTorch\* Training and Inference Performance using Intel® AMX, Jul 2023](https://www.intel.com/content/www/us/en/developer/articles/technical/accelerate-pytorch-training-inference-on-amx.html)
* [Intel® Deep Learning Boost (Intel® DL Boost) - Improve Inference Performance of Hugging Face BERT Base Model in Google Cloud Platform (GCP) Technology Guide, Apr 2023](https://networkbuilders.intel.com/solutionslibrary/intel-deep-learning-boost-intel-dl-boost-improve-inference-performance-of-hugging-face-bert-base-model-in-google-cloud-platform-gcp-technology-guide)
* [Get Started with Intel® Extension for PyTorch\* on GPU | Intel Software, Mar 2023](https://www.youtube.com/watch?v=Id-rE2Q7xZ0&t=1s)
* [Accelerate PyTorch\* INT8 Inference with New “X86” Quantization Backend on X86 CPUs, Mar 2023](https://www.intel.com/content/www/us/en/developer/articles/technical/accelerate-pytorch-int8-inf-with-new-x86-backend.html)
* [Accelerating PyTorch Transformers with Intel Sapphire Rapids, Part 1, Jan 2023](https://huggingface.co/blog/intel-sapphire-rapids)
* [Intel® Deep Learning Boost - Improve Inference Performance of BERT Base Model from Hugging Face for Network Security Technology Guide, Jan 2023](https://networkbuilders.intel.com/solutionslibrary/intel-deep-learning-boost-improve-inference-performance-of-bert-base-model-from-hugging-face-for-network-security-technology-guide)
* [Scaling inference on CPUs with TorchServe, PyTorch Conference, Dec 2022](https://www.youtube.com/watch?v=066_Jd6cwZg)
* [What is New in Intel Extension for PyTorch, PyTorch Conference, Dec 2022](https://www.youtube.com/watch?v=SE56wFXdvP4&t=1s)
* [Accelerating PyG on Intel CPUs, Dec 2022](https://www.pyg.org/ns-newsarticle-accelerating-pyg-on-intel-cpus)
* [Accelerating PyTorch Deep Learning Models on Intel XPUs, Dec 2022](https://www.oneapi.io/event-sessions/accelerating-pytorch-deep-learning-models-on-intel-xpus-2-ai-hpc-2022/)
* [Introducing the Intel® Extension for PyTorch\* for GPUs, Dec 2022](https://www.intel.com/content/www/us/en/developer/articles/technical/introducing-intel-extension-for-pytorch-for-gpus.html)
* [PyTorch Stable Diffusion Using Hugging Face and Intel Arc, Nov 2022](https://towardsdatascience.com/pytorch-stable-diffusion-using-hugging-face-and-intel-arc-77010e9eead6)
* [PyTorch 1.13: New Potential for AI Developers to Enhance Model Performance and Accuracy, Nov 2022](https://www.intel.com/content/www/us/en/developer/articles/technical/pytorch-1-13-new-potential-for-ai-developers.html)
* [Easy Quantization in PyTorch Using Fine-Grained FX, Sep 2022](https://medium.com/intel-analytics-software/easy-quantization-in-pytorch-using-fine-grained-fx-80be2c4bc2d6)
* [Empowering PyTorch on Intel® Xeon® Scalable processors with Bfloat16, Aug 2022](https://pytorch.org/blog/empowering-pytorch-on-intel-xeon-scalable-processors-with-bfloat16/)
* [Accelerating PyTorch Vision Models with Channels Last on CPU, Aug 2022](https://pytorch.org/blog/accelerating-pytorch-vision-models-with-channels-last-on-cpu/)
* [One-Click Enabling of Intel Neural Compressor Features in PyTorch Scripts, Aug 2022](https://medium.com/intel-analytics-software/one-click-enable-intel-neural-compressor-features-in-pytorch-scripts-5d4e31f5a22b)
* [Increase PyTorch Inference Throughput by 4x, Jul 2022](https://www.intel.com/content/www/us/en/developer/articles/technical/increase-pytorch-inference-throughput-by-4x.html)
* [PyTorch Inference Acceleration with Intel® Neural Compressor, Jun 2022](https://medium.com/pytorch/pytorch-inference-acceleration-with-intel-neural-compressor-842ef4210d7d)
* [Accelerating PyTorch with Intel® Extension for PyTorch, May 2022](https://medium.com/pytorch/accelerating-pytorch-with-intel-extension-for-pytorch-3aef51ea3722)
* [Grokking PyTorch Intel CPU performance from first principles (part 1), Apr 2022](https://pytorch.org/tutorials/intermediate/torchserve_with_ipex.html)
* [Grokking PyTorch Intel CPU performance from first principles (part 2), Apr 2022](https://pytorch.org/tutorials/intermediate/torchserve_with_ipex_2.html)
* [Grokking PyTorch Intel CPU performance from first principles, Apr 2022](https://medium.com/pytorch/grokking-pytorch-intel-cpu-performance-from-first-principles-7e39694412db)
* [KT Optimizes Performance for Personalized Text-to-Speech, Nov 2021](https://community.intel.com/t5/Blogs/Tech-Innovation/Artificial-Intelligence-AI/KT-Optimizes-Performance-for-Personalized-Text-to-Speech/post/1337757)
* [Accelerating PyTorch distributed fine-tuning with Intel technologies, Nov 2021](https://huggingface.co/blog/accelerating-pytorch)
* [Scaling up BERT-like model Inference on modern CPU - part 1, Apr 2021](https://huggingface.co/blog/bert-cpu-scaling-part-1)
* [Scaling up BERT-like model Inference on modern CPU - part 2, Nov 2021](https://huggingface.co/blog/bert-cpu-scaling-part-2)
* [NAVER: Low-Latency Machine-Learning Inference](https://www.intel.com/content/www/us/en/customer-spotlight/stories/naver-ocr-customer-story.html)
* [Intel® Extensions for PyTorch, Feb 2021](https://pytorch.org/tutorials/recipes/recipes/intel_extension_for_pytorch.html)
* [Optimizing DLRM by using PyTorch with oneCCL Backend, Feb 2021](https://pytorch.medium.com/optimizing-dlrm-by-using-pytorch-with-oneccl-backend-9f85b8ef6929)
* [Accelerate PyTorch with IPEX and oneDNN using Intel BF16 Technology, Feb 2021](https://medium.com/pytorch/accelerate-pytorch-with-ipex-and-onednn-using-intel-bf16-technology-dca5b8e6b58f) (*Note*: the APIs mentioned in this article are deprecated.)
* [Intel and Facebook Accelerate PyTorch Performance with 3rd Gen Intel® Xeon® Processors and Intel® Deep Learning Boost’s new BFloat16 capability, Jun 2020](https://community.intel.com/t5/Blogs/Tech-Innovation/Artificial-Intelligence-AI/Intel-and-Facebook-Accelerate-PyTorch-Performance-with-3rd-Gen/post/1335659)
* [Intel and Facebook\* collaborate to boost PyTorch\* CPU performance, Apr 2019](https://www.intel.com/content/www/us/en/developer/articles/case-study/intel-and-facebook-collaborate-to-boost-pytorch-cpu-performance.html)
* [Intel and Facebook\* Collaborate to Boost Caffe\*2 Performance on Intel CPU’s, Apr 2017](https://www.intel.com/content/www/us/en/developer/articles/technical/intel-and-facebook-collaborate-to-boost-caffe2-performance-on-intel-cpu-s.html)
(+21)

Cheat Sheet
===========

Get started with Intel® Extension for PyTorch\* using the following commands:

| Description | Command |
| -------- | ------- |
| Basic CPU Installation | `python -m pip install intel_extension_for_pytorch` |
| Import Intel® Extension for PyTorch\* | `import intel_extension_for_pytorch as ipex` |
| Capture a Verbose Log (Command Prompt) | `export ONEDNN_VERBOSE=1` |
| Optimization During Training | `model = ...`<br>`optimizer = ...`<br>`model.train()`<br>`model, optimizer = ipex.optimize(model, optimizer=optimizer)` |
| Optimization During Inference | `model = ...`<br>`model.eval()`<br>`model = ipex.optimize(model)` |
| Optimization Using the Low-Precision Data Type bfloat16 <br>During Training (Default FP32) | `model = ...`<br>`optimizer = ...`<br>`model.train()`<br/><br/>`model, optimizer = ipex.optimize(model, optimizer=optimizer, dtype=torch.bfloat16)`<br/><br/>`with torch.no_grad():`<br>`    with torch.cpu.amp.autocast():`<br>`        model(data)` |
| Optimization Using the Low-Precision Data Type bfloat16 <br>During Inference (Default FP32) | `model = ...`<br>`model.eval()`<br/><br/>`model = ipex.optimize(model, dtype=torch.bfloat16)`<br/><br/>`with torch.cpu.amp.autocast():`<br>`    model(data)` |
| [Prototype] Fast BERT Optimization | `from transformers import BertModel`<br>`model = BertModel.from_pretrained("bert-base-uncased")`<br>`model.eval()`<br/><br/>`model = ipex.fast_bert(model, dtype=torch.bfloat16)` |
| Run CPU Launch Script (Command Prompt): <br>Automate Configuration Settings for Performance | `ipexrun [knobs] <your_pytorch_script> [args]` |
| [Prototype] Run HyperTune to perform hyperparameter/execution configuration search | `python -m intel_extension_for_pytorch.cpu.hypertune --conf-file <your_conf_file> <your_python_script> [args]` |
| [Prototype] Enable Graph Capture | `model = ...`<br>`model.eval()`<br>`model = ipex.optimize(model, graph_mode=True)` |
| Post-Training INT8 Quantization (Static) | `model = ...`<br>`model.eval()`<br>`data = ...`<br/><br/>`qconfig = ipex.quantization.default_static_qconfig`<br/><br/>`prepared_model = ipex.quantization.prepare(model, qconfig, example_inputs=data, inplace=False)`<br/><br/>`for d in calibration_data_loader():`<br>`    prepared_model(d)`<br/><br/>`converted_model = ipex.quantization.convert(prepared_model)` |
| Post-Training INT8 Quantization (Dynamic) | `model = ...`<br>`model.eval()`<br>`data = ...`<br/><br/>`qconfig = ipex.quantization.default_dynamic_qconfig`<br/><br/>`prepared_model = ipex.quantization.prepare(model, qconfig, example_inputs=data)`<br/><br/>`converted_model = ipex.quantization.convert(prepared_model)` |
| [Prototype] Post-Training INT8 Quantization (Tuning Recipe) | `model = ...`<br>`model.eval()`<br>`data = ...`<br/><br/>`qconfig = ipex.quantization.default_static_qconfig`<br/><br/>`prepared_model = ipex.quantization.prepare(model, qconfig, example_inputs=data, inplace=False)`<br/><br/>`tuned_model = ipex.quantization.autotune(prepared_model, calibration_data_loader, eval_function, sampling_sizes=[100], accuracy_criterion={'relative': .01}, tuning_time=0)`<br/><br/>`convert_model = ipex.quantization.convert(tuned_model)` |
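
Putting the bfloat16 inference rows together, a minimal runnable sketch (the model is a stand-in for your own):

```python
import torch
import intel_extension_for_pytorch as ipex

model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU()).eval()
data = torch.randn(8, 64)

# Low-precision inference: optimize for bfloat16, then run under autocast.
model = ipex.optimize(model, dtype=torch.bfloat16)
with torch.no_grad(), torch.cpu.amp.autocast():
    model(data)
```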
