Get Faster and More Relevant Search Results (Generally Available)
| Available in: All Editions |
In the Winter ’15 release, Salesforce Knowledge article search was updated with the new search infrastructure. In Spring ’15, we’re expanding this search infrastructure to all Salesforce search utilities. This expanded enhancement was previously available only through a pilot program.
These improvements and new search capabilities are available on a rolling basis after Spring ’15 is released, and before the Summer ’15 release is rolled out.
- Faster indexing
- It now takes less time for records to be searchable after they’re created or updated.
- Improved tokenization
- A key enhancement of the new search infrastructure is the change from bigram tokenization to morphological tokenization. When users search, the search engine returns results based on tokens in the search string that match tokens in the index. With improved tokenization, content is indexed more appropriately, resulting in fewer irrelevant matches in search results.
- Morphological tokenization ensures that searches in East Asian languages such as Chinese, Japanese, Korean, and Thai (CJKT), which don’t include spaces between words, return accurate search results. Previously, when indexing a string of characters, the search engine applied bigram tokenization to segment the string into pairs of characters, known as bigrams.
- For example, before Spring ’15, a search for 京都 (Kyoto) in Japanese incorrectly included 東京都 (Tokyo Prefecture) in the search results.
- Using bigramming, 東京都 (Tokyo Prefecture) was tokenized as these bigrams.
東京 Tokyo
京都 Kyoto
- With morphological tokenization, the same phrase is properly segmented into these tokens.
東京 Tokyo
都 Prefecture
In this context, both tokens are meaningful and correct, and 京都 (Kyoto) isn’t tokenized.
Now, a search for 京都 (Kyoto) returns only results that include 京都 (Kyoto) and not 東京都 (Tokyo Prefecture).
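The difference between the two tokenization schemes can be sketched in a few lines of Python. This is only an illustration of the example above, not the search engine's implementation: `bigrams`, `matches`, and the hard-coded `MORPHOLOGICAL` table are hypothetical names, and a real morphological analyzer would rely on a language-specific dictionary rather than a lookup table.

```python
def bigrams(text: str) -> list[str]:
    """Split text into every pair of adjacent characters."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


# Hypothetical stand-in for a morphological analyzer: the segmentations
# are copied from the example above, not computed.
MORPHOLOGICAL = {
    "東京都": ["東京", "都"],
    "京都": ["京都"],
}


def matches(query_tokens: list[str], indexed_tokens: list[str]) -> bool:
    """A record matches when every query token appears in its index."""
    return all(token in indexed_tokens for token in query_tokens)


# Bigram tokenization: 東京都 is indexed as 東京 and 京都,
# so a search for 京都 (Kyoto) matches it by accident.
bigram_hit = matches(bigrams("京都"), bigrams("東京都"))

# Morphological tokenization: 東京都 is indexed as 東京 and 都,
# so the same search no longer matches.
morph_hit = matches(MORPHOLOGICAL["京都"], MORPHOLOGICAL["東京都"])

print(bigrams("東京都"))      # ['東京', '京都']
print(bigram_hit, morph_hit)  # True False
```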
- Limitation with Japanese language users querying records that are tokenized as Chinese
- If a record contains at least 300 characters and contains Kanji only (no Katakana or Hiragana), the content is tokenized as Chinese. Therefore, a Japanese language user searching for this record doesn’t find it in search results. Kanji-only records with fewer than 300 characters are tokenized as both Chinese and Japanese.
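The 300-character rule can be expressed as a small decision function. The sketch below makes several assumptions that the release note doesn't confirm: "Kanji" is approximated by the CJK Unified Ideographs block (U+4E00 to U+9FFF), kana by the Hiragana and Katakana blocks (U+3040 to U+30FF), and the character count is the total length of the text. The function name and return values are hypothetical, and content that falls outside the rule returns `None` because its handling isn't described here.

```python
def is_kanji(ch: str) -> bool:
    # Assumption: CJK Unified Ideographs block only.
    return "\u4e00" <= ch <= "\u9fff"


def is_kana(ch: str) -> bool:
    # Hiragana (U+3040-U+309F) and Katakana (U+30A0-U+30FF).
    return "\u3040" <= ch <= "\u30ff"


def kanji_only_tokenization(text: str):
    """Return the languages a Kanji-only record is tokenized as.

    Returns None when the text contains kana or no Kanji at all,
    because the rule above doesn't cover those cases.
    """
    has_kanji = any(is_kanji(ch) for ch in text)
    has_kana = any(is_kana(ch) for ch in text)
    if not has_kanji or has_kana:
        return None
    if len(text) >= 300:
        return {"chinese"}
    return {"chinese", "japanese"}


short_record = kanji_only_tokenization("東京都")
long_record = kanji_only_tokenization("京" * 300)
```

Under these assumptions, `short_record` is tokenized as both languages, while `long_record` is tokenized as Chinese only and is therefore not found by a Japanese language user.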
- Improved alphanumeric search
- Thanks to more efficient handling of punctuation, we’ve improved search results when you search for specialized strings such as URLs, email addresses, and phone numbers. The punctuation symbols <>[]{}()!,.;:"' are removed from indexed content and from users’ searches when they appear at the beginning or end of a tokenized string. Removing these characters makes it easier for the search engine to recognize when a user is searching for a phone number, as in this example: (415) is tokenized as 415.
Previously, if a user searched for きっと、来る in Japanese, the punctuation caused this matching string to be excluded from search results: きっと来る. Now, the same string results in a match.
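As a rough illustration of stripping punctuation from the edges of a token, the following Python sketch uses `str.strip` with the symbol list given above. The function name is hypothetical, and the sketch covers only the listed ASCII symbols; how the engine handles other punctuation, such as the Japanese comma in the example above, isn't modeled.

```python
# The symbols listed above that are removed from the edges of a token.
EDGE_PUNCTUATION = "<>[]{}()!,.;:\"'"


def strip_edge_punctuation(token: str) -> str:
    """Remove listed punctuation from the start and end of a token only.

    Punctuation inside the token is left untouched.
    """
    return token.strip(EDGE_PUNCTUATION)


print(strip_edge_punctuation("(415)"))              # 415
print(strip_edge_punctuation("user@example.com,"))  # user@example.com
```

Because only the edges are stripped, the dot and the @ sign inside an email address survive, which is what lets the string be recognized as a single specialized term.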
- Words that contain both letters and numbers are split into separate tokens. For example, web2lead is broken up into these tokens: web2lead, web2, web, 2, and lead. A search for web matches items that contain web2lead; however, a search for web2lead returns only results that include the full term, web2lead.
Previously, a search for web2lead returned matches with web, 2, and lead, even if those terms were in separate places within the item.
As another example, a record name that includes letters, numbers, and punctuation is broken up into several tokens. A search for any of these indexed tokens returns the record ABC-Record-XYZ1234.
| Record Name | Indexed Tokens |
| ABC-Record-XYZ1234 | ABC-Record-XYZ1234, ABCRecordXYZ1234, ABC, Record, 1234, XYZ1234, XYZ |
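The exact splitting rule isn't spelled out here, but one rule that happens to reproduce both examples is sketched below in Python. It is a guess inferred from the two token lists, not the engine's documented algorithm: keep the full term, keep the term with punctuation removed, then for each punctuation-separated part keep every run of letters or digits plus each cumulative prefix of those runs. The function name `index_tokens` is hypothetical.

```python
import re


def index_tokens(term: str) -> set[str]:
    """Guess at the indexed tokens for a term (inferred rule, see above)."""
    tokens = {term}
    # Split on anything that is not an ASCII letter or digit.
    parts = [part for part in re.split(r"[^0-9A-Za-z]+", term) if part]
    # The term with punctuation removed, e.g. ABCRecordXYZ1234.
    tokens.add("".join(parts))
    for part in parts:
        # Runs of letters or digits, e.g. web, 2, lead.
        runs = re.findall(r"[A-Za-z]+|[0-9]+", part)
        tokens.update(runs)
        # Cumulative prefixes of the runs, e.g. web, web2, web2lead.
        prefix = ""
        for run in runs:
            prefix += run
            tokens.add(prefix)
    return tokens


web2lead_tokens = index_tokens("web2lead")
record_tokens = index_tokens("ABC-Record-XYZ1234")

# A search for "web" finds the item; an item containing only the separate
# words "web", "2", and "lead" does not match a search for "web2lead".
separate_words = set().union(*(index_tokens(w) for w in ["web", "2", "lead"]))
print("web" in web2lead_tokens)      # True
print("web2lead" in separate_words)  # False
```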
- Further, when you search for an exact match, using either quotation marks (“”) or sidebar search, special characters are treated as part of the search term to help you find the record that you’re looking for. For example, if you search for 100!%, we match only 100!%. We don’t match items with 100%.
Before Spring ’15, if you searched for an exact match for 100!%, we matched items with 100 or 100%.
- Improved validation of the AND NOT operator
- If a search doesn’t include a word both before and after the AND NOT operator, “and” and “not” are treated as part of the search term. For example, a search for AND NOT apples returns items with the word apples, while a search for oranges AND NOT apples doesn’t return items with the word apples.
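The validation described above can be sketched as a small query parser in Python. This is a simplified illustration under stated assumptions: the function name is hypothetical, the operator is recognized only in uppercase, and only the first AND NOT in a query is considered. It returns the terms to include and the terms to exclude, without modeling how the engine then ranks or filters results.

```python
def parse_and_not(query: str) -> tuple[list[str], list[str]]:
    """Split a query into (included terms, excluded terms).

    AND NOT acts as an operator only when words appear on both sides;
    otherwise "and" and "not" are kept as ordinary search terms.
    """
    words = query.split()
    for i in range(len(words) - 1):
        if words[i] == "AND" and words[i + 1] == "NOT":
            before, after = words[:i], words[i + 2:]
            if before and after:
                return ([w.lower() for w in before],
                        [w.lower() for w in after])
            break
    # No valid operator: every word is part of the search term.
    return [w.lower() for w in words], []


print(parse_and_not("oranges AND NOT apples"))  # (['oranges'], ['apples'])
print(parse_and_not("AND NOT apples"))          # (['and', 'not', 'apples'], [])
```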
For more information about searching on the new search infrastructure, see “How Search Works” in the Salesforce Help.

