5.x merge 4.x #26404

Merged
merged 23 commits into from
Nov 6, 2024

Conversation

asmorkalov
Contributor

@asmorkalov asmorkalov commented Nov 2, 2024

OpenCV Contrib: opencv/opencv_contrib#3819
OpenCV Extra: no changes

#26203 from FantasqueX:generic-simd-warpAffineBlocklineNN
#26318 from hanliutong:rvv-intrin-m2
#26356 from hardikkamboj:4.x
#26357 from dkurt:dkurt/ov_out_names_from_graph
#26368 from hanliutong:rvv-hal-license
#26370 from mshabunin:fix-winrt-warnings
#26374 from OrkWard:fix-js-build-script
#26381 from dkurt:dk/hotfix_dnn_debug
#26384 from mshabunin:fix-winrt-warnings-2
#26388 from vrabaud:4_8u
#26390 from asmorkalov:as/kleidicv_no_sme2
#26402 from asmorkalov:as/win_uwp_ci

Previous "Merge 4.x": #26358

FantasqueX and others added 22 commits October 14, 2024 01:28
Changed "If the pixel value is smaller than the threshold" to "If the pixel value is smaller than or equal to the threshold" to make the line align with the working of the code.
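The corrected wording can be checked with a minimal pure-Python sketch of the binary threshold rule. The helper `threshold_binary` is hypothetical and only models the comparison documented for `THRESH_BINARY` (a pixel receives the maximum value only when it is strictly greater than the threshold); it is not the library implementation.

```python
def threshold_binary(pixels, thresh, maxval=255):
    """Hypothetical model of the binary threshold rule.

    A pixel strictly greater than `thresh` becomes `maxval`;
    a pixel smaller than or equal to `thresh` becomes 0.
    """
    return [maxval if p > thresh else 0 for p in pixels]


# The middle pixel equals the threshold, so it falls on the "0" side.
row = [126, 127, 128]
result = threshold_binary(row, thresh=127)
print(result)
```

The pixel equal to the threshold is mapped to 0, which is why "smaller than or equal to" is the accurate description.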
OpenVINO friendly output names from non-compiled Model
Use LMUL=2 in the RISC-V Vector (RVV) backend of Universal Intrinsic. opencv#26318

This patch changes the RVV backend of Universal Intrinsic from `LMUL=1` to `LMUL=2`.

With this change, each Universal Intrinsic type corresponds to two RVV vector registers, and each intrinsic function operates on two vector registers. Because algorithms written with Universal Intrinsic usually do not use the maximum number of registers, this lets the RVV backend use more register resources without modifying the algorithm implementations.

Overall, this patch improves performance.

We compiled OpenCV with `Clang-19.1.1` and `GCC-14.2.0` and ran it on `CanMV-k230` and `Banana-Pi F3`, which gives four compiler/device scenarios. `opencv_perf_core` contains 3363 cases, of which:
- 901 (26.8%) cases achieved more than `5%` performance improvement in all four scenarios, and the average speedup of these test cases (compared to scalar) increased from `3.35x` to `4.35x`
- 75 (2.2%) cases had more than `5%` performance loss in all four scenarios, indicating that these cases perform better with `LMUL=1` than with `LMUL=2`. This involves `Mat_Transform`, `hasNonZero`, `KMeans`, `meanStdDev`, `merge` and `norm2`. Among them, `Mat_Transform` only degrades in a few cases (`8UC3`), and the actual execution time of `hasNonZero` is so short that the loss can be ignored. For `KMeans`, `meanStdDev`, `merge` and `norm2`, we should be able to use the HAL to optimize/restore their performance. (In fact, we have already done this for `merge` in opencv#26216.)
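The trade-off behind `LMUL=2` can be illustrated with a little arithmetic. This is only a sketch: the helper names are hypothetical, and `VLEN = 128` is an assumed register width chosen for illustration, not a statement about either test board. It uses the RVV relation that one vector operand holds `VLEN * LMUL / SEW` elements, and that the 32 architectural vector registers form `32 / LMUL` register groups.

```python
def lanes_per_operand(vlen_bits, sew_bits, lmul):
    """Elements held by one vector operand: VLEN * LMUL / SEW."""
    return vlen_bits * lmul // sew_bits


def register_groups(lmul, total_registers=32):
    """Number of independent operands available when registers are grouped by LMUL."""
    return total_registers // lmul


VLEN = 128  # assumed register width in bits, for illustration only

for lmul in (1, 2):
    lanes = {sew: lanes_per_operand(VLEN, sew, lmul) for sew in (8, 16, 32, 64)}
    print(f"LMUL={lmul}: lanes per operand {lanes}, register groups {register_groups(lmul)}")
```

Doubling `LMUL` doubles the elements processed per intrinsic call while halving the number of independent operands, which fits the observation that most kernels gain while a few register-hungry or very short ones lose.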

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
Update py_thresholding.markdown
WinRT/UWP build: fix some specific warnings
Add the missing license header in hal_rvv.
Fix incorrect string format in js build script opencv#26374

I ran into this small problem, mentioned in opencv#25084 (comment), while experimenting with the wasm build. It seems https://github.com/EDVTAZ hasn't fixed it yet, so I created this tiny PR.

Additionally, I removed a redundant argument from the `add_argument` call: `'store_true'` already sets the default, see https://docs.python.org/3/library/argparse.html#action.
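A minimal sketch of why the extra argument is redundant, using only the standard `argparse` module. The flag name `--example_flag` is hypothetical and is not taken from the build script.

```python
import argparse

parser = argparse.ArgumentParser()
# 'store_true' implies default=False, so no explicit default is needed.
parser.add_argument('--example_flag', action='store_true')  # hypothetical flag name

args_off = parser.parse_args([])
args_on = parser.parse_args(['--example_flag'])

print(args_off.example_flag, args_on.example_flag)
```

Without the flag the attribute is `False`, and with it the attribute is `True`, with no `default=` specified.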

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
WinRT/UWP build: fix more warnings in media part
Disable SME2 branches in KleidiCV as they are incompatible with some Clang versions, e.g. NDK 28b1
…neBlocklineNN

Use generic SIMD in warpAffineBlocklineNN
Added Universal Windows Package build to CI.
@asmorkalov
Contributor Author

@mshabunin @opencv-alalek could you take a look?

@asmorkalov
Contributor Author

@hanliutong I just ignored 4.x changes for RVV and took 5.x version, since you provided a dedicated patch. Please take a look and let me know if I missed something important.

@hanliutong
Contributor

> ignored 4.x changes for RVV and took 5.x version

@asmorkalov Yes, then #26318 should not be merged into 5.x

@asmorkalov asmorkalov merged commit 0398354 into opencv:5.x Nov 6, 2024
17 of 27 checks passed
@asmorkalov asmorkalov mentioned this pull request Nov 13, 2024
9 participants