-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[libc][wchar] implement wcslen #124150
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[libc][wchar] implement wcslen #124150
Changes from all commits
a8173a1
6f386ab
b0b7b9a
4f34e64
6c7f5fd
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -14,12 +14,13 @@ | |
#ifndef LLVM_LIBC_SRC_STRING_STRING_UTILS_H | ||
#define LLVM_LIBC_SRC_STRING_STRING_UTILS_H | ||
|
||
#include "hdr/types/size_t.h" | ||
#include "src/__support/CPP/bitset.h" | ||
#include "src/__support/CPP/type_traits.h" // cpp::is_same_v | ||
#include "src/__support/macros/config.h" | ||
#include "src/__support/macros/optimization.h" // LIBC_UNLIKELY | ||
#include "src/string/memory_utils/inline_bzero.h" | ||
#include "src/string/memory_utils/inline_memcpy.h" | ||
#include <stddef.h> // For size_t | ||
|
||
namespace LIBC_NAMESPACE_DECL { | ||
namespace internal { | ||
|
@@ -79,24 +80,21 @@ LIBC_INLINE size_t string_length_wide_read(const char *src) { | |
return char_ptr - src; | ||
} | ||
|
||
LIBC_INLINE size_t string_length_byte_read(const char *src) { | ||
size_t length; | ||
for (length = 0; *src; ++src, ++length) | ||
; | ||
return length; | ||
} | ||
|
||
// Returns the length of a string, denoted by the first occurrence | ||
// of a null terminator. | ||
LIBC_INLINE size_t string_length(const char *src) { | ||
template <typename T> LIBC_INLINE size_t string_length(const T *src) { | ||
#ifdef LIBC_COPT_STRING_UNSAFE_WIDE_READ | ||
// Unsigned int is the default size for most processors, and on x86-64 it | ||
// performs better than larger sizes when the src pointer can't be assumed to | ||
// be aligned to a word boundary, so it's the size we use for reading the | ||
// string a block at a time. | ||
return string_length_wide_read<unsigned int>(src); | ||
if constexpr (cpp::is_same_v<T, char>) | ||
return string_length_wide_read<unsigned int>(src); | ||
#else | ||
return string_length_byte_read(src); | ||
size_t length; | ||
for (length = 0; *src; ++src, ++length) | ||
; | ||
return length; | ||
Comment on lines
+94
to
+97
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. would it make sense to keep There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't see a reason why we wouldn't want There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ah, reading back through the comments I see you already tried this. In the case would it make sense to have a generic template for |
||
#endif | ||
} | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
//===-- Implementation of wcslen ------------------------------------------===// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#include "src/wchar/wcslen.h" | ||
|
||
#include "hdr/types/size_t.h" | ||
#include "hdr/types/wchar_t.h" | ||
#include "src/__support/common.h" | ||
#include "src/__support/macros/config.h" | ||
#include "src/string/string_utils.h" // string_length_trivial | ||
|
||
namespace LIBC_NAMESPACE_DECL { | ||
|
||
LLVM_LIBC_FUNCTION(size_t, wcslen, (const wchar_t *src)) { | ||
return internal::string_length(src); | ||
} | ||
|
||
} // namespace LIBC_NAMESPACE_DECL |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
//===-- Implementation header for wcslen ----------------------------------===// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#ifndef LLVM_LIBC_SRC_WCHAR_WCSLEN_H | ||
#define LLVM_LIBC_SRC_WCHAR_WCSLEN_H | ||
|
||
#include "hdr/types/size_t.h" | ||
#include "hdr/types/wchar_t.h" | ||
#include "src/__support/macros/config.h" | ||
|
||
namespace LIBC_NAMESPACE_DECL { | ||
|
||
size_t wcslen(const wchar_t *src); | ||
|
||
} // namespace LIBC_NAMESPACE_DECL | ||
|
||
#endif // LLVM_LIBC_SRC_WCHAR_WCSLEN_H |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
//===-- Unittests for wcslen ----------------------------------------------===// | ||
// | ||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#include "hdr/types/size_t.h" | ||
#include "hdr/types/wchar_t.h" | ||
#include "src/wchar/wcslen.h" | ||
#include "test/UnitTest/Test.h" | ||
|
||
TEST(LlvmLibcWCSLenTest, EmptyString) { | ||
ASSERT_EQ(size_t{0}, LIBC_NAMESPACE::wcslen(L"")); | ||
} | ||
|
||
TEST(LlvmLibcWCSLenTest, AnyString) { | ||
ASSERT_EQ(size_t{12}, LIBC_NAMESPACE::wcslen(L"Hello World!")); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's what I was imagining:
The reason is to avoid issues around passing incorrect types to
string_length_wide_read
, I thinkif constexpr
is evaluated late enough in the process you might get warnings about it.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The duplication between the two versions (for loop) isn't as DRY as I prefer; I don't think the above approach can be simplified to have the
char
specialized version ofstring_length
call the non-specialized version.The parameter of
string_length_wide_read
isn't the template variable; theif constexpr
statement as written compiles without warning. https://godbolt.org/z/ajYhWbdddNow that I am pulling in cpp::type_traits, I could add some SFINAE to
string_length
so that it can only be called withchar*
andwchar_t*
, if you'd like?I don't feel strongly about any of the above, but please let me know what you prefer. I'll wait to land this until then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at it written out, it's probably overcomplicating things to add the template specialization just to avoid the
if constexpr
. If it compiles without warnings, it's probably fine.I don't think we really need to add the SFINAE to limit the template. In practice I expect it'll be used for
char
,wchar_t
, and maybeunsigned char
, but it will behave as expected for any other integer type. Adding the check seems like unnecessary complexity for little practical benefit.