Releases: Xilinx/Vitis_Libraries
2022.2 Update 1
Vitis Ultrasound Library Updates:
The Vitis Ultrasound library, which targets the Versal AI Core Series VCK190 evaluation board, provides L1/L2/L3 APIs as a toolbox for ultrasound image processing. The current version provides:
- L1: the lowest level of abstraction, composed of simple BLAS operations.
- L2: the functional units of the Beamformer, which can be obtained by composing L1 libraries.
- L3: the complete Beamformer, which builds on the levels above and contains run tests for the PW/SA/Scanline beamforming designs.
2022.2 Release
Vitis Data Analytics Library
Added the following APIs:
- String LIKE: API returns true if the string matches the supplied pattern, similar to string find in C++. The NOT LIKE expression returns false if LIKE returns true.
- String EQUAL: API returns true if the string exactly matches the base string, similar to string compare in C++. The NOT EQUAL expression returns false if EQUAL returns true.
- JSONLine Loader: API is enhanced to support more general data types, including nested fields and lists.
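As a rough host-side analogy for the predicate semantics described above, the sketch below models LIKE with `std::string::find` and EQUAL with `std::string::compare`. It is not the library's FPGA API: every function name here is hypothetical, and LIKE is assumed to mean a plain substring match, following the "string find" comparison.

```cpp
#include <cassert>
#include <string>

// Hypothetical software model; these names are not part of the Data Analytics library.

// LIKE modeled as a substring search (assumption based on the "string find" analogy).
bool str_like(const std::string& text, const std::string& pattern) {
    return text.find(pattern) != std::string::npos;
}

// EQUAL modeled as a full comparison, following the "string compare" analogy.
bool str_equal(const std::string& text, const std::string& base) {
    return text.compare(base) == 0;
}

// The NOT variants are plain negations of the positive predicates.
bool str_not_like(const std::string& text, const std::string& pattern) {
    return !str_like(text, pattern);
}
bool str_not_equal(const std::string& text, const std::string& base) {
    return !str_equal(text, base);
}

// Small self-check that doubles as a usage example.
void check_string_predicates() {
    assert(str_like("vitis_libraries", "lib"));
    assert(str_not_like("vitis_libraries", "xyz"));
    assert(str_equal("abc", "abc"));
    assert(str_not_equal("abc", "abcd"));
}
```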
Vitis DSP Library
The following features have been added to the DSP library:
- FFT Window: New library element. FFT Window is a utility to apply a windowing (scaling) function such as Hamming to a frame of data samples.
- FFT/iFFT: FFT Dynamic Point Size (run-time point size determination) is now supported with parallelized configurations.
- FIR Filters: All FIR library elements (with the exception of FIR Resampler) now support Super Sample Rate operation for higher throughput. To minimize latency, Super Sample Rate operation is implemented using streaming interfaces. In addition, usage of the Input Window Size (TP_INPUT_WINDOW_VSIZE) parameter has been consolidated across the library. TP_INPUT_WINDOW_VSIZE describes the number of samples processed by the graph in a single iteration run. Reloadable coefficients within the Super Sample Rate configurations are now supported on all FIR variants that support SSR operation.
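To illustrate what the FFT Window element does conceptually, here is a minimal floating-point sketch that builds a Hamming window and scales a frame of samples by it. It is a plain C++ model, not the AI Engine graph; the function names are hypothetical, and the symmetric Hamming formula with 0.54/0.46 coefficients is assumed.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hamming window coefficients: w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1)).
std::vector<double> hamming_window(std::size_t n_points) {
    assert(n_points >= 2);  // the formula divides by (N - 1)
    const double pi = std::acos(-1.0);
    std::vector<double> w(n_points);
    for (std::size_t n = 0; n < n_points; ++n) {
        w[n] = 0.54 - 0.46 * std::cos(2.0 * pi * static_cast<double>(n) /
                                      static_cast<double>(n_points - 1));
    }
    return w;
}

// Scale one frame of samples element-wise by the window, as done before an FFT.
std::vector<double> apply_window(const std::vector<double>& frame,
                                 const std::vector<double>& window) {
    assert(frame.size() == window.size());
    std::vector<double> out(frame.size());
    for (std::size_t i = 0; i < frame.size(); ++i) {
        out[i] = frame[i] * window[i];
    }
    return out;
}
```

The window tapers the frame edges toward 0.08 and leaves the center sample unscaled, which reduces spectral leakage at the cost of a wider main lobe.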
Vitis Solver Library
Added support for two APIs on AI Engine:
- QRF (QR decomposition)
- Cholesky decomposition
Vitis Utility Library
Added support for a 4D datamover on AI Engine. The 4D datamover takes a queue of 9x64-bit descriptors as input to describe a 4D access pattern. It reads the 4D cuboid with the desired pattern and processes the descriptors one by one.
- read4D
- write4D
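The sketch below shows one way a 9-word descriptor could encode a 4D access pattern. The field layout is an assumption made for illustration (one base offset followed by a stride/count pair per dimension); the real descriptor format is defined by the library documentation and may differ. All names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// ASSUMED layout, for illustration only (not taken from the library):
//   word 0      : base offset
//   words 1..8  : (stride, count) for dimensions 0..3, innermost dimension first
using Descriptor4D = std::array<std::uint64_t, 9>;

// Expand a single descriptor into the sequence of element addresses it visits.
std::vector<std::uint64_t> expand_descriptor(const Descriptor4D& d) {
    assert(d[2] > 0 && d[4] > 0 && d[6] > 0 && d[8] > 0);  // each dimension needs >= 1 element
    const std::uint64_t base = d[0];
    std::vector<std::uint64_t> addrs;
    // Iterate from the outermost dimension (3) down to the innermost (0).
    for (std::uint64_t i3 = 0; i3 < d[8]; ++i3)
        for (std::uint64_t i2 = 0; i2 < d[6]; ++i2)
            for (std::uint64_t i1 = 0; i1 < d[4]; ++i1)
                for (std::uint64_t i0 = 0; i0 < d[2]; ++i0)
                    addrs.push_back(base + i0 * d[1] + i1 * d[3] + i2 * d[5] + i3 * d[7]);
    return addrs;
}

// Process a queue of descriptors one by one, concatenating their address streams.
std::vector<std::uint64_t> expand_queue(const std::vector<Descriptor4D>& queue) {
    std::vector<std::uint64_t> all;
    for (const Descriptor4D& d : queue) {
        const std::vector<std::uint64_t> part = expand_descriptor(d);
        all.insert(all.end(), part.begin(), part.end());
    }
    return all;
}
```

Under this assumed layout, a descriptor with base 10, an inner dimension of stride 1 and count 2, and a second dimension of stride 4 and count 2 visits addresses 10, 11, 14, 15, that is, a 2x2 tile cut out of a row-major buffer that is 4 elements wide.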
Vitis Vision Library
PL Additions and Updates:
- New functions:
- HDR Decompanding: Decompresses (decompands) data that was companded with a piece-wise linear (PWL) mapping to a lower bit depth
- Degamma: Designed to linearize the input from sensor or any pre-processing IP
- ISPStats: Collects histogram-based stats of Bayer and color images
- ISP all-in-one pipeline: All the ISP related functions stitched in one pipeline with option to exclude unwanted functions during runtime and compile time.
- Multi-stream ISP: Multiple input stream ISP pipeline
- Updates:
- Added new template parameter XFCVDEPTH for xf::cv::Mat class that can be used to assign custom depth to the Mat’s internal hls::stream.
- All APIs in the library updated with newly added XFCVDEPTH parameter for xf::cv::Mat
- Removed deprecated SDSVHLS macro from all files
- Replaced deprecated RESOURCE pragma with BIND_STORAGE/BIND_OP pragmas
- Renamed NO and RO in all files to SPC (Single Pixel per Clock) and MPC (Multiple Pixels per Clock)
- Added missing reference functions in L1, L2, L3 testbench files
- Fixed the incorrect Gaussian Difference implementation
- Fixed incorrect dst Mat assignment in xf::cv::Mat member function convertBitdepth
- Updated analyzeDiff in L1/include/common/xf_sw_utils.hpp to a static function
- Added missing “Test Passed/Failed/Finished” check in all L1/L2/L3 functions.
- Added 16 bit and 4 channel support, corrected B and R channel swap issue for channel extract function.
- Fixed a bug in the BGR2HLS module of the cvtColor function
- Restructured L1 channel combine accel and testbench code
- Fixed SVM emulation and cosim hang issue
- Updated loop tripcounts of pyrDown, histogram, HDR extract, rgb2yuyv module in cvtColor to fix synthesis latency numbers
- Fixed array reshape pragma in xf_sobel.hpp, xf_video_mem.hpp files
- Lib Infra Changes:
- Added frequency setting in L2/L3 JSON files: 300 MHz for NPPC1 and 150 MHz for NPPC8 in most cases.
- Updated JSON and Makefiles to use the ps_on_x86 feature for software emulation targeting embedded platforms. Software emulation for embedded platforms no longer uses QEMU; it uses only the regular g++ compilation flow.
- Added missing environment checks in all JSON and Makefiles.
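The piece-wise linear (PWL) mapping mentioned for HDR Decompanding in the PL function list above can be sketched as a lookup over knee points with linear interpolation between them. This is a generic illustration of a PWL curve in integer arithmetic, not the library kernel; the `Knee` type, the function name, and any knee values used with it are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One knee point of the curve: input code -> output code.
struct Knee {
    std::uint32_t in;
    std::uint32_t out;
};

// Evaluate a PWL curve given by knee points sorted by strictly increasing `in`.
// Inputs outside the knee range are clamped to the first or last output value.
std::uint32_t pwl_map(const std::vector<Knee>& knees, std::uint32_t x) {
    assert(knees.size() >= 2);
    if (x <= knees.front().in) return knees.front().out;
    if (x >= knees.back().in) return knees.back().out;
    std::size_t k = 1;
    while (knees[k].in < x) ++k;  // segment [k-1, k] now contains x
    const Knee& a = knees[k - 1];
    const Knee& b = knees[k];
    // Linear interpolation in 64-bit signed arithmetic to avoid overflow.
    const std::int64_t dy = static_cast<std::int64_t>(b.out) - static_cast<std::int64_t>(a.out);
    const std::int64_t dx = static_cast<std::int64_t>(b.in) - static_cast<std::int64_t>(a.in);
    const std::int64_t step = static_cast<std::int64_t>(x - a.in);
    return static_cast<std::uint32_t>(static_cast<std::int64_t>(a.out) + dy * step / dx);
}
```

With illustrative knees (0, 0), (512, 512), (1024, 2048), (4095, 65535), the first segment passes values through unchanged while later segments stretch them with increasing slope, which is the general shape of a curve that maps a narrow code range onto a wider one.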
AI Engine Additions and Updates:
- New functions:
- Resize / Resize + Normalize
- Smart tiling for x86 64-bit platforms
- Updates:
- RTL Data movers
- 8-bit PL / 8-bit AIE data movers
- Multi-channel support
- Optimized implementation
- Optimized smart tiling / stitching for higher performance
- Fixed random crashes in hardware emulation flow
- Miscellaneous bug fixes
Known issues
- Vitis GUI projects on RHEL 8.3 and CentOS 8.2 may fail because of a lib conflict in the LD_LIBRARY_PATH setting. Remove ${env_var:LD_LIBRARY_PATH} from the project environment settings for the function to build successfully.
- rgbir2bayer, isppipeline_rgbir PL functions are not supplied with input images.
- Software emulation for Warptransform L2 testcases doesn’t work because of a known issue with the platform.
- Warptransform L1 URAM cases fail CSim because of a known HLS issue.
2022.1 Update 3
Update
- DSP library updates:
- Fix for Matrix Multiply Untiler logic with respect to 256-bit vector sizes
2022.1 Update 2
Update
- Vision library updates:
- Added support for custom depth specification for xf::cv::Mat in all kernels
- L3 Gaussian Difference: added extra filter and sigma arguments
- Documentation corrections
- Fix make version compatibility issue in makefiles
2022.1 Update 1
Update
- Codec:
- Added six kinds of encode/decode L2 APIs
- DSP:
- Fixed some issues
- Made the FIR kernel members public
2022.1 Release
2022.1 Release Notes
Vitis Codec Library
Codec Library is an open-source library written in C/C++ for accelerating image coding, decoding and related processing algorithms. It now covers two levels of acceleration: the module level (L1) and the pre-defined kernel level (L2).
The 2022.1 release provides a range of algorithms, including:
- JPEG decoding: one L2 API is provided for accelerating the entire JPEG decoding process, which supports the ‘Sequential DCT-based mode’ of the ISO/IEC 10918-1 standard. It can process 1 Huffman token and create up to 8 DCT coefficients within one cycle. It is also an easy-to-use decoder, as it can directly parse the JPEG file header without the help of software functions. In addition, an L1 API is provided for Huffman decoding.
Vitis Database Library
- Merged partition / bloomfilter / join into a single kernel so that the three operators share resources on the FPGA. Although this kernel can perform only one of the three operators at a time, it takes far fewer resources than three standalone kernels. This design eliminates the time cost of switching xclbins between operators, enables pipelined execution of kernels, and reduces the DMA workload. It targets the U50, which costs less and still retains HBM.
- Key-Value store offloading: introduces a new kernel for accelerating the K-V compaction operation in log-structured merge-tree databases.
Vitis Data Analytics Library
- CSV scanner: Used to accelerate the extract, transform and load (ETL) process. It integrates GZIP decompression, a CSV parser and a filter module so that they work in parallel. The ETL accelerator can work together with a database to run queries on large volumes of semi-structured and unstructured data.
- Geospatial APIs: Two major APIs in this family have been included: the Spatial Join and KNN. The former API inserts the columns from one feature table to another based on location or proximity, while the latter is often used to find the K nearest neighbors around the center point. They are both vital for spatial analysis and spatial data mining.
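As a reference point for what the KNN API computes, here is a minimal brute-force sketch that returns the K points nearest to a center point in the plane. It only illustrates the operation on the host; the `Point` type and function name are hypothetical, and Euclidean distance on 2D coordinates is assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Point {
    double x;
    double y;
};

// Return the indices of the k points closest to `center`, nearest first.
std::vector<std::size_t> k_nearest(const std::vector<Point>& pts, Point center, std::size_t k) {
    assert(k <= pts.size());
    std::vector<std::size_t> idx(pts.size());
    for (std::size_t i = 0; i < idx.size(); ++i) idx[i] = i;

    // Squared Euclidean distance is enough for ranking, so the sqrt is skipped.
    auto dist2 = [&](std::size_t i) {
        const double dx = pts[i].x - center.x;
        const double dy = pts[i].y - center.y;
        return dx * dx + dy * dy;
    };

    // Only the first k positions need to be sorted.
    std::partial_sort(idx.begin(), idx.begin() + static_cast<std::ptrdiff_t>(k), idx.end(),
                      [&](std::size_t a, std::size_t b) { return dist2(a) < dist2(b); });
    idx.resize(k);
    return idx;
}
```

A brute-force scan costs one distance evaluation per point and query, which is the kind of regular, data-parallel work that makes this operation a candidate for hardware acceleration.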
There are some known issues for this release.
- Log Analyzer in the L2 demo fails hardware build with 2022.1 Vitis. Please use 2021.2 Vitis for it.
Vitis Data Compression Library
- ZLIB Compression Improvement:
- Reduced TreeGen initiation interval to < 1K to reduce overall resource utilization for 8KB octa-core compression.
- Customized Octa-Core compression for 8KB solution (Reduced Booster Window 8KB).
- Static IP customized.
- Improved Compression IP Timing for Versal and achieved > 250MHz.
- Provided Memory Mapped GZIP File Decompression.
- ZLIB Decompression Improvement:
- Added ADLER32 and provided uncompressed size in TUSER.
- Provided Quad-Core Decompress solution for 32KB and 8KB file size to achieve 4x throughput (up to 2 GB/s).
Vitis DSP Library
- DDS / Mixer: The DDS / Mixer library element now has extended type support. It now supports cfloat and cint32 for TT_DATA when configured as a mixer. When configured as a DDS, cfloat is now supported for TT_DATA. Additionally, the DDS / Mixer now supports Super Sample Rate operation for higher throughput.
- FFT/iFFT: FFT point size support has been extended to 65536. Performance has been improved by approximately 10% for cases using PARALLEL_POWER>1 which were previously supported.
- FIR Filters: All FIR library elements now support streaming interfaces as well as window interfaces. The single rate asymmetric FIR variant now supports Super Sample Rate operation for higher throughput. The FIR resampler library element, which performs fractional decimation, has been added. It supersedes the existing FIR interpolate fractional library unit. All FIR variants now support a larger maximum value for FIR_LEN, up to 8k depending on variant, data/coefficient type and API choice.
Vitis Genomics Library
Three new Genomics accelerators have been added:
- Smith-Waterman algorithm: created to provide higher throughput than the existing benchmark solution.
- PairHMM algorithm: created to achieve architectural maximum performance.
- SMEM algorithm: created to achieve the required DRAM bandwidth while overcoming the latency.
Vitis Graph Library
- Added the new algorithm Maximal Independent Set.
- Enhanced Louvain Modularity. L2 Louvain Modularity is able to support large-scale graphs.
- Added an L3 API to divide huge graphs into multiple parts, and added other data structures to support the Louvain Modularity on these parts.
Vitis Solver Library
In this release, the following legacy APIs from Vivado_HLS were migrated to the solver library. They are all hls::stream based APIs that support the std::complex type.
- Cholesky
- Cholesky Inverse
- QR Inverse
- QRF (QR decomposition)
- SVD
Vitis Vision Library
New features and functions
The below functions and pipelines are newly added into the library.
PL additions/enhancements:
- New functions:
- Rotate
- TV-L1 optical flow
- Multi-stream ISP (basic) support
- Updates:
- Added Demosaicing kernel (xf_demosaicing_rt.hpp) having the input Bayer pattern as run time parameter.
- Lib Infra Changes:
- Added API JSON for L2 which helps in usage of a given function’s API in the Vitis GUI
AIE additions/enhancements:
- Updates:
- Introduced RTL Data-movers with improved latency over HLS data-movers
- All tests updated with RTL data-movers
Known issues
- Vitis GUI projects on RHEL 8.3 and CentOS 8.2 may fail because of a lib conflict in the LD_LIBRARY_PATH setting. Remove ${env_var:LD_LIBRARY_PATH} from the project environment settings for the function to build successfully.
- SVM L2 PL function fails hardware emulation with 2022.1 Vitis. Use 2021.1 Vitis for this function.
- rgbir2bayer, isppipeline_rgbir PL functions are not supplied with input images
- Software emulation for Warptransform L2 testcases doesn’t work because of a known issue with the platform.
- Warptransform L1 URAM cases fail CSim because of a known HLS issue.
- Hardware emulation in AIE testcases may throw a segmentation fault at the end, even though the functional test completes successfully.
2021.2 Update 2
Library Updates
- Released the Vitis Genomics Library (genomics)
2021.2 Update 1
Library Updates
- Codec
- add OrderTokenize L1 API
- Compression
- gzip update
- Data Analytics
- add JSON parser L1 API
- DSP
- change 4 PL L1 APIs' interface from memory buffer to stream, and doc update
- Graph
- add Louvain API, including L2 and 2 L3 functions
- Solver
- add QRF L1 API
v2021.2 Release
v2021.2 Release Notes
Vitis Data Analytics Library
The 2021.2 release provides CSV Parser:
- CSV Parser parses comma-separated value files and generates an object stream, which is easily connected with the DataFrame APIs. CSV is a commonly used storage format in data lakes. The CSV parser can accelerate the data extraction process.
Vitis Data Compression Library
- ZSTD Quad-Core Compression
- Created ZSTD Multi-Core architecture to provide high throughput for single file compression. Using the ZSTD quad-core solution, users can get throughput > 1 GB/s.
- Zstd Decompress Improvement
- ZSTD Decompress is optimized in this release. Overall resource usage is reduced to 19.6K, with 20% higher throughput compared to the previous release.
- GZIP Decompress Improvement
- Re-architected the GZIP Decompress cores to reduce resources to 6.9K and improve throughput compared to the previous release. With the new architecture, overall IP latency is also reduced to ~1.5K cycles. Provided ZLIB decompression containing an ADLER32 checksum to catch any error in the input file. Added functionality to provide the uncompressed size in the output stream port TUSER (in case the end application needs to know the uncompressed size).
- GZIP Compression Improvement
- Created various ZLIB/GZIP Octa-Core Compression Kernels for different block sizes (8KB, 16KB, 32KB) and achieved > 2 GB/s throughput for all variants. Updated the IP core to provide the compressed size in the output AXI stream TUSER port (in case an application needs the compressed size).
- Huffman TreeGen latency is reduced significantly, to < 1K. As a result, multi-core architectures (octa-core) require only a single TreeGen. This significantly reduces the resource requirement of the 8KB and 16KB block size compression cores compared to the previous release.
- Compression ratio is improved from 2.67 to 2.7 for the Silesia file set with 32KB block size.
- Snappy/LZ4 Decompress Improvement
- Optimized Snappy and LZ4 Decompress throughput.
Vitis Database Library
In the 2021.2 release, GQE starts to support asynchronous input / output, along with multi-card support.
- Asynchronous input / output: uses std::future<size_t> to notify GQE L3 of the readiness of each input section; its value is the number of effective rows in the input section. It uses std::promise<size_t> to notify the caller of GQE L3 of the readiness of each section of the final result; its value is the number of effective rows in the output section. Asynchronous support allows the FPGA to start processing as soon as part of the input data is ready. This way, the FPGA does not wait until all input data is ready, which shrinks the overhead of preparing data for the FPGA.
- Multi-card support: identifies multiple Alveo cards that are suitable for the work. It loads the same xclbins onto these cards and calls them when there are more tasks than one card can handle at the same time. The data structure also keeps pinned host buffers and device buffers alive until they are explicitly released. This saves the time needed to load xclbins, create pinned buffers and create device buffers.
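The asynchronous handshake described above can be sketched with the standard library alone. In the sketch below, a hypothetical `consume_sections` function stands in for GQE L3: it waits on one `std::future<size_t>` per input section and fulfills one `std::promise<size_t>` per output section. None of these names come from the library, and the "processing" is a pass-through of the row count, purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <thread>
#include <vector>

// Hypothetical stand-in for GQE L3: waits for each input section to become ready,
// then announces the matching output section through a promise.
void consume_sections(std::vector<std::future<std::size_t>>& inputs,
                      std::vector<std::promise<std::size_t>>& outputs) {
    assert(inputs.size() == outputs.size());
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        const std::size_t rows = inputs[i].get();  // blocks only until section i is ready
        outputs[i].set_value(rows);                // pass-through "processing" (illustrative)
    }
}

// Drive the sketch: a producer thread readies the input sections one by one,
// while the caller collects the row count of each output section as it appears.
std::vector<std::size_t> run_async_demo(const std::vector<std::size_t>& rows_per_section) {
    const std::size_t n = rows_per_section.size();
    std::vector<std::promise<std::size_t>> in_ready(n);
    std::vector<std::promise<std::size_t>> out_ready(n);
    std::vector<std::future<std::size_t>> inputs;
    std::vector<std::future<std::size_t>> results;
    for (std::size_t i = 0; i < n; ++i) {
        inputs.push_back(in_ready[i].get_future());
        results.push_back(out_ready[i].get_future());
    }

    std::thread producer([&] {
        for (std::size_t i = 0; i < n; ++i) in_ready[i].set_value(rows_per_section[i]);
    });
    std::thread consumer([&] { consume_sections(inputs, out_ready); });

    std::vector<std::size_t> seen;
    for (std::future<std::size_t>& r : results) seen.push_back(r.get());
    producer.join();
    consumer.join();
    return seen;
}
```

The point of the pattern is that the consumer starts on section 0 as soon as its future is fulfilled, without waiting for the producer to finish the remaining sections.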
Vitis DSP Library
The below features have been added to the library in this release.
- DDS / Mixer - new library element
Function | Namespace and class name |
---|---|
DDS / Mixer | xf::dsp::aie::mixer::dds_mixer |
This component may be configured to one of three modes. The first mode is a DDS only. The second mode is a single channel mixer. The third mode is a symmetrical mixer, taking two input channels and mixing each with the DDS output and the conjugate of DDS output respectively, combining the result to one output channel. DDS/Mixer supports window input/output interface, as well as streaming interface.
- FIR Filters
Single rate FIRs now support streaming interfaces as well as window interfaces.
- FFT/iFFT
FFT now supports streaming interfaces as well as window interfaces. In addition, FFT now offers improved performance and greater point size support with parallelization.
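The three DDS / Mixer modes described in this section can be modeled in a few lines of floating-point C++. This is a behavioral sketch only, not the `xf::dsp::aie::mixer::dds_mixer` API: the function names, the `cplx` alias, and the use of a radians-per-sample phase increment are all assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<float>;

// Mode 1, DDS only: a phase accumulator driving a unit-magnitude complex exponential.
std::vector<cplx> dds(std::size_t n, float phase_inc, float phase0 = 0.0f) {
    std::vector<cplx> out(n);
    float phase = phase0;
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = std::polar(1.0f, phase);
        phase += phase_inc;
    }
    return out;
}

// Mode 2, single-channel mixer: multiply the input by the DDS output.
std::vector<cplx> mix1(const std::vector<cplx>& in, float phase_inc) {
    const std::vector<cplx> lo = dds(in.size(), phase_inc);
    std::vector<cplx> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) out[i] = in[i] * lo[i];
    return out;
}

// Mode 3, symmetrical mixer: in1 * dds + in2 * conj(dds), combined into one output channel.
std::vector<cplx> mix2(const std::vector<cplx>& in1, const std::vector<cplx>& in2,
                       float phase_inc) {
    assert(in1.size() == in2.size());
    const std::vector<cplx> lo = dds(in1.size(), phase_inc);
    std::vector<cplx> out(in1.size());
    for (std::size_t i = 0; i < in1.size(); ++i) {
        out[i] = in1[i] * lo[i] + in2[i] * std::conj(lo[i]);
    }
    return out;
}
```

With a phase increment of a quarter turn, the DDS output steps through 1, j, -1, -j; feeding the same constant into both inputs of the symmetrical mixer then leaves only twice the real part of that sequence.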
Vitis Graph Library
The algorithms implemented by Vitis Graph Library include:
- Similarity analysis: Cosine Similarity, Jaccard Similarity, k-nearest neighbor. From 2021.2, the ‘weight’ feature is supported for Cosine Similarity.
- Centrality analysis: PageRank.
- Pathfinding: Single Source Shortest Path (SSSP), Multi-Sources Shortest Path (MSSP).
- Connectivity analysis: Weakly Connected Components and Strongly Connected Components.
- Community Detection: Louvain Modularity, Label Propagation and Triangle Count.
- Search: Breadth First Search, 2-Hop Search
- Graph Format: Renumber(2021.2), Calculate Degree and Format Convert between CSR and CSC.
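For the Graph Format entry, the following sketch shows a conventional host-side way to convert a graph from CSR to CSC with a counting pass, which also yields the in-degree of every vertex along the way. It is not the library kernel; the `Compressed` struct and function name are hypothetical, and a square adjacency structure (sources and destinations drawn from the same vertex set) is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Shared container for CSR (per-row) or CSC (per-column) storage.
struct Compressed {
    std::vector<std::size_t> offsets;  // size = number of vertices + 1
    std::vector<std::size_t> indices;  // neighbor ids, grouped per vertex
};

// Convert CSR (out-edges per source) to CSC (in-edges per destination).
Compressed csr_to_csc(const Compressed& csr) {
    assert(!csr.offsets.empty());
    const std::size_t n = csr.offsets.size() - 1;
    Compressed csc;
    csc.offsets.assign(n + 1, 0);
    csc.indices.resize(csr.indices.size());

    // Count the in-degree of each vertex (the "calculate degree" step).
    for (std::size_t dst : csr.indices) {
        assert(dst < n);
        ++csc.offsets[dst + 1];
    }
    // A prefix sum turns the counts into column offsets.
    for (std::size_t v = 0; v < n; ++v) csc.offsets[v + 1] += csc.offsets[v];

    // Scatter every edge into its column, tracking the next free slot per column.
    std::vector<std::size_t> next(csc.offsets.begin(), csc.offsets.end() - 1);
    for (std::size_t src = 0; src < n; ++src) {
        for (std::size_t e = csr.offsets[src]; e < csr.offsets[src + 1]; ++e) {
            csc.indices[next[csr.indices[e]]++] = src;
        }
    }
    return csc;
}
```

Because sources are visited in increasing order, the source ids inside each column come out sorted without an extra sort step.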
Vitis Security Library
The 2021.2 release provides support for:
- KECCAK-256
- CRC32C
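For reference when validating results on the host, CRC32C can be computed with a short bitwise routine. The sketch below is a generic software implementation using the reflected Castagnoli polynomial 0x82F63B78 with the usual all-ones initial value and final inversion; it is not the Security Library API, and the function name is chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Bitwise CRC32C (Castagnoli), reflected form, init 0xFFFFFFFF, final XOR 0xFFFFFFFF.
std::uint32_t crc32c(const unsigned char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            // If the low bit is set, shift and XOR in the polynomial; otherwise just shift.
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

// Convenience overload for byte strings.
std::uint32_t crc32c(const std::string& s) {
    return crc32c(reinterpret_cast<const unsigned char*>(s.data()), s.size());
}
```

The standard check input for CRC parameter sets is the ASCII string "123456789", for which CRC32C yields 0xE3069283; a bit-at-a-time loop like this is slow but makes a convenient golden model.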
Vitis Utilities Library
Adds two Data-Mover implementations for debugging hardware issues:
- LoadDdrToStreamWithCounter: For loading data from PL’s DDR to AIE through AXI stream and recording the count of data sent to AIE.
- StoreStreamToMasterWithCounter: For receiving data from AIE through AXI stream and saving it to PL’s DDR, as well as recording the count of data sent to DDR.
Vitis Vision Library
New features and functions
The below functions and pipelines are newly added to the library:
Versal AI Engine additions:
- blobFromImage
- Function used in many ML pre-processing tasks to do normalization and other tasks.
- Back to back filter2D with batch size three support
- Application showcasing how to increase the throughput of a single filter2D kernel by running three back-to-back filter2D operations, achieving 555 FPS with PL datamovers.
New Programmable Logic (PL) functions and features:
- ISP pipeline and functions:
- End to End Mono Image Processing (ISP) pipeline with CLAHE TMO
- Useful for ISP pipelines with monochrome sensors
- RGB-IR along with RGB-IR Image Processing (ISP) pipeline
- Useful for ISP pipelines with IR sensors
- Global Tone Mapping (GTM) along with an ISP pipeline using GTM
- Adding to growing TMO (tone-mapping-operators) in the library for different quality and area tradeoff purposes: CLAHE, Local Tone Mapping, Quantization and Dithering
Known issues
Vitis GUI projects on RHEL 8.3 and CentOS 8.2 may fail because of a lib conflict in the LD_LIBRARY_PATH setting. Remove ${env_var:LD_LIBRARY_PATH} from the project environment settings for the function to build successfully.
2021.1 Release
v2021.1 Release Notes
Vitis BLAS Library
The 2021.1 release introduces L2 kernels for GEMM and GEMV. It also introduces L3 APIs based on the XRT (Xilinx Runtime library) Native APIs.
Vitis Codec Library
This initial release provides a range of algorithms including:
- JPEG Decoder: “JPEG” stands for Joint Photographic Experts Group, the name of the committee that created the JPEG standard and also other still picture coding standards.
- JPEG-XL: JPEG XL is a raster-graphics file format that supports both lossy and lossless compression. It is designed to outperform existing raster formats and thus to become their universal replacement.
Vitis Data Analytics Library
The 2021.1 release provides Two-Gram text analytics:
- Two Gram Predicate (TGP) is a search of the inverted index with a term of 2 characters. For a dataset that established an inverted index, it can find the matching id in each record in the inverted index.
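The idea of searching an inverted index with 2-character terms can be sketched on the host as follows. This is a simplified model with hypothetical names, not the TGP kernel interface: it indexes every adjacent character pair of each record and returns the ids of the records that contain a given pair.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Inverted index: 2-character term -> ids of the records that contain it.
using TwoGramIndex = std::map<std::string, std::set<std::size_t>>;

// Index every adjacent character pair of every record.
TwoGramIndex build_index(const std::vector<std::string>& records) {
    TwoGramIndex index;
    for (std::size_t id = 0; id < records.size(); ++id) {
        const std::string& r = records[id];
        for (std::size_t i = 0; i + 1 < r.size(); ++i) {
            index[r.substr(i, 2)].insert(id);
        }
    }
    return index;
}

// Return the sorted ids of the records that contain the 2-character term.
std::vector<std::size_t> lookup(const TwoGramIndex& index, const std::string& term) {
    assert(term.size() == 2);
    const auto it = index.find(term);
    if (it == index.end()) return {};
    return std::vector<std::size_t>(it->second.begin(), it->second.end());
}
```

For the records "alveo", "versal" and "vitis", the term "ve" maps to records 0 and 1, while "is" maps only to record 2.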
Vitis Data Compression Library
- GZIP Multi Core Compression
- New GZIP Multi-Core Compress Streaming Accelerator, a purely stream-only solution (free-running kernel). It comes in several variants supporting block sizes of 4KB, 8KB, 16KB and 32KB.
- Facebook ZSTD Compression Core
- New Facebook ZSTD Single Core Compression accelerator with block size 32KB. Multi-core ZSTD compression is in progress (for higher throughput).
- GZIP Low Latency Decompression
- A new version of GZIP decompress with improved latency for each block, fewer resources (35% lower LUT, 83% lower BRAM) and improved FMax.
- ZLIB Whole Application Acceleration using U50
- L3 GZIP solution for U50 Platform, containing 6 Compression core to saturate full PCIe bandwidth. It is pr...