-
Notifications
You must be signed in to change notification settings - Fork 149
Multiple threads to perform BPE encoding #503
Conversation
@@ -74,7 +74,7 @@ final class TextUnsupervisedTests: XCTestCase { | |||
XCTAssertEqual(example.second.shape[0], 1024) | |||
totalCount += 1 | |||
} | |||
XCTAssertEqual(totalCount, 12) | |||
XCTAssertEqual(totalCount, 64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this increase the length of time it takes to run these tests during every presubmit? If so, let's keep the lower document count by being explicit during init.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I increased it to show that with this concurrency a larger number of documents won't increase test time. I'm fine to revert it back to 4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happened to these changes? I believe they will resolve the failure in #526
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the failure? I didn't touch the test eventually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In presubmits, I'm seeing:
/swift-models/Tests/DatasetsTests/TextUnsupervised/TextUnsupervisedTests.swift:77: error: TextUnsupervisedTests.testCreateWikiText2WithBpe : XCTAssertEqual failed: ("64") is not equal to ("12") -
The test passes for me locally with 12, so I don't quite understand this. I wonder if kokoro is picking up a cached image somehow... 🤔 I'll try running again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this approach since it cleans up in-line. Thank you!!
encodedDocs = documents.map { embedding(for: $0, bpe: bpe) } | ||
} else { | ||
encodedDocs = documents.concurrentMap { embedding(for: $0, bpe: bpe) } | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace tab with spaces
No description provided.