-
Notifications
You must be signed in to change notification settings - Fork 4
will having two split methods cause confusion? #6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Most uses of As for confusion: maybe, but I don't anticipate much as the names are different. |
I don't think the names being different really matters here. There will be 2 string splitting methods with pretty similar semantics. I fear people (including myself) won't be able to remember which method has which semantics.
Why is that? I used the second parameter to propertyAliasesFile
.split('\n')
.map(line => line.split('#', 1)[0].trim())
.filter(line => line.length > 0) Was this bad for some reason? |
That seems reasonable! I haven't included that in the explainer, but I did a Sourcegraph search for people using the For that example specifically, now if you want to instead extract all of the comments rather than the non comments, using propertyAliasesFile
.split('\n')
.map(line => line.splitn('#', 2)[1])
.filter(line => line && line.length > 0) Doing this with current split is still possible, but is much less efficient. An extra intermediate array is created, the original string needs to be iterated over for longer to find split points, and many more small string allocations are needed for the intermediate array. propertyAliasesFile
.split('\n')
.map(line => line.split('#').slice(1).join('#'))
.filter(line => line.length > 0) |
Ah, that's actually helpful for me to better understand the motivation for this proposal. Still, I'm unsure whether the usefulness outweighs the risk of added confusion. |
I think going for the |
(fwiw, i don't like |
Another benefit of this is that it bypasses the questions about regex-based splitting. The behavior remains "do exactly as you normally would", the flag just controls whether the remainder gets added as a final item to the array or thrown out. |
At the matrix channel I suggested adopting a different name.
Another name I'm adding just for the bikeshed sake would be I prefer a new name over expanding the |
I opened an issue about interpretation of the count arg and though I think the Python behavior is best regardless, a substantially different name like |
@michaelficarra I don't think it's "splice" case. The "splice" is bad because:
Not sure about native speakers, but as who have bad English skill, I really don't know what "splice" meaning in English, I never use this word except writing code, so the name "splice" is just opaque to me.
I guess it's hard to have a good name for what "splice" really do, because the it do a complex work and unfortunately it doesn't follow Off-topic Instead of
Or even have slice notation syntax for that: /Off-topic
|
Actually I see it in another way. Similar names should have similar semantics. So after we have
I know it could always bring confusion if someone dig it and see
But it's more or less a documentation issue. The old broken |
@lucacasonato Yeah, as I said in the meeting , last week we in JSCIG meeting, we discussed this proposal, and we surprisingly found half of the attendances don't know So I don't worry confusion of two methods too much, actually people won't confused with something they might never know. 😂 |
@tabatkins What "bypass" mean? You know at first I also feel let s = 'axbxcxd'
let re = /(x)/
s.split(re) // [ 'a', 'x', 'b', 'x', 'c', 'x', 'd' ]
s.split(re, 3) // [ 'a', 'x', 'b' ] // oh !
s.split(re, 3, {reminder:true}) // what it should returns? The "consistent" but weird behavior might be return I'm curious how engine could easily return that result without post process on the result returned by underlying regexp engine. Note python |
@leobalter Though at first glance I also felt new name could avoid confusion , but after second thought, I have to say it might not very helpful like we thought. The point is, if it actually do very similar thing , people eventually would ask why there are two? Which one should I use? Essentially the confusion is not about the name, but about the usage of two similar apis. The name similarity is just make it easy to expose. Make the new named api have much different semantic like only support string might helpful (so people are ok with two api because they have different ability), might not (because people may expect the new api also accept regexp, so in the end we need to add similar type check to throw on regexp-like arg --- just like startsWith/endsWith). There is also another point, api discoverability. As my previous comment, people actually unaware At the end, for such fundamental api (it's not WeakRef, Atomics, etc.!), I don't believe we should decrease the confusion by decrease the discoverability. |
(Very off topic aside for @hax cause I thought the mention of splice was interesting)
I hadn’t considered this before, but I have a feeling this has to do less with one’s English and more to do with how certain physical media technologies have drifted out of the popular consciousness over time. It’s likely that this word has become less familiar among native speakers, too. The word describes (with an uncanny degree of precision) the exact things that the method does: splicing is cutting, inserting, removing, and rejoining segments of something that together form a line. Which seems like a weirdly specific word for most people to know to begin with, but in the 20th century, splicing was very important in at least two physical media technologies. You spliced tape and spliced film when creating and editing music and movies. (The word is older than that, but before film and tape, you’d probably have heard it mainly just in the context of splicing rope.) (If you were splicing tape, you’d make cuts in the tape in order to extract, insert, or replace a sub-portion. The harder part was rejoining the segments, i.e. without the part you removed and/or with the part you inserted, whichever are applicable. In older music, if the splicing job was sloppy, artifacts of the physical splicing would be audible in the music). Given both tape and film have rapidly became much less common and less visible in the last few decades, the word has probably become much more obscure, too. It’s not quite as frozen in time as “radio button” because splice is still a “generic” word, but the programming use of “splice” seems at least partly informed by an analogy with how the word was used in the context of splicing physical media. So at one time, it would have seemed like a very natural/obvious term, but you’re probably right that this is no longer true. |
I like using a |
I believe having this options bag adds too much complexity for this one method that was supposed to do this one thing, limited by another parameter. The option will open an object that might have a property that says: do this other thing, or that original thing but different, and now I'm slightly confused what that number parameter does. This fails the logic of having I also find I think it's good to bikeshed some synonyms such as |
I wouldn't expect a third parameter - i'd expect that the signature would be:
where |
If we reuse the existing function exactly as it is, with the flag just including the remainder, my intention is that this allows us to ignore the regex questions, because the answer would be "just act as normal". However, I have since learned that the interaction between a regex splitter and the N argument is worthless, so it doesn't really help us: "a|b|c|d".split(/\|/, 2);
// returns ["a", "b"]
"a|b|c|d".split(/(\|)/, 2);
// returns ["a", "|"] That is, the N argument does not mean "perform N splits", it is instead literally identical to just calling So yeah, going instead with either a new function, or taking the options arg as the second arg like @ljharb suggests, would be better, so we can give better behavior immediately. |
Today we can have things such as |
@leobalter that's true only if there's web code that relies on that, which i doubt. But also, we could still do a Get for the option names, and if no option names are found, we could ToNumber the object, which would still support that usage. |
Yeah I highly doubt we actually have live code depending on toValue() of the second arg, but Jordan's suggestion would let us still apply it if it proves necessary. |
Can you help me understand how the options would be like? The way I see it feels confusing. "a|b|c|d".split("|", 2); // ["a", "b"]
"a|b|c|d".split("|", { parts: 2, remainder: true }); // ["a", "b|c|d"]
"a|b|c|d".split("|", { parts: 2 }); // ["a", "b"] or ["a", "b|c|d"] I want the remainder option to be optional if anything, but the parts/limit option remains ambiguous. I'd like the options object to have actual options for any end. Requiring one to write the parts/limit option and the |
@leobalter i hadn't really thought about it much. Perhaps something like this:
etc? |
I still have a strong preference for two methods with different responsibilities.
Rather tan:
I prefer the single responsibility for each method. |
But it's still more or less confused that And It seems we should have an option bag which includes the old non-option bag behavior, this is what If we include it, then We could make Actually it seems much simpler to only have a one boolean option "legacyMode" So we could use The only concern is, we normally use But we still have the problem:
We could make the name shorten to single letter Even the shortest form
Previously I thought the reason people not use Though it's not the original motivation of this proposal, I think it should be the goal of the proposal. With these two reasons, I still prefer |
For better disclosure, I strongly oppose to sentence-based/composed option values. Coming from an ESL background, I know people see the language words as not other than keywords understanding the logic behind but without the actual meaningful dictionary information. I'm happy to tell more about this to anyone interested. |
I suppose As a non-native speaker, I should agree that "sentence-based/composed" are bad, but I want to say another (might be more) important point is we should use simple english words. Offtopic Here is a repo https://github.com/mahavivo/english-wordlists which contains different level english word lists in Chinese education system, I use it to check how hard a word is for non-native speakers. For example, "splice" is not in word list of CET4 or CET6 or even TEM8 (which is for graduate student major in English, and contains 13000 words). And as word usage frequency from COCA (Corpus of Contemporary American English) , "splice" is at 20171 😂 . So it's reflect the @bathos comment, even native speakers do not use that word much. /Offtopic |
Yes, i intentionally used a sentence as a placeholder, because obviously that wouldn’t fly. |
I fear we're just going to create a "splice" scenario where most people are going to just have to look at the documentation every time to figure out what it does now that there's more than one option.
The text was updated successfully, but these errors were encountered: