How do I calculate f(s) for a string?

Scan the string to find the lexicographically smallest character and count its occurrences; this gives f(s).
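A minimal sketch of that computation follows. The name `f` mirrors the problem statement; using `min` followed by `count` (two passes over the string) is just one convenient way to write it, and the string is assumed to be non-empty.

```python
def f(s: str) -> int:
    """Return how many times the lexicographically smallest character occurs in s."""
    smallest = min(s)          # smallest character; s is assumed non-empty
    return s.count(smallest)   # count every occurrence of that character
```

For instance, `f("dcce")` evaluates to 2, because the smallest character is "c" and it occurs twice.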

Why precompute word frequencies instead of comparing each query directly?

Computing f once per word and sorting the results lets each query be answered with a single binary search in O(log N) time, instead of rescanning every word for every query.

Can I use a hash map instead of sorting?

Yes. A hash map (or a small array) from frequency to word count also works, but you then need prefix or suffix sums over it to count how many words exceed a given frequency. Sorting the frequencies and binary-searching them gives the same answer with less bookkeeping.
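The counting alternative could look roughly like the sketch below. It swaps the hash map for a fixed-size list of buckets, which relies on the assumption that no string is longer than 10 characters (the limit given in the problem constraints); `MAX_LEN` and `count_greater` are illustrative names, not part of the original solution.

```python
from typing import List

MAX_LEN = 10  # assumption: no string is longer than 10 characters


def f(s: str) -> int:
    return s.count(min(s))


def count_greater(queries: List[str], words: List[str]) -> List[int]:
    # freq[k] = number of words whose f-value is exactly k
    freq = [0] * (MAX_LEN + 2)
    for w in words:
        freq[f(w)] += 1
    # greater[k] = number of words whose f-value is strictly greater than k
    greater = [0] * (MAX_LEN + 2)
    for k in range(MAX_LEN, -1, -1):
        greater[k] = greater[k + 1] + freq[k + 1]
    return [greater[f(q)] for q in queries]
```

With the suffix sums in place, each query costs one call to `f` plus a single list lookup.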

What if the smallest character appears more than once in a string?

Distinct characters cannot tie, because lexicographic order on letters is total, so the smallest character is always unique. If it occurs several times, every occurrence counts toward f(s).

Does this approach handle the LeetCode problem Compare Strings by Frequency of the Smallest Character efficiently?

Yes. Computing f once for every word, sorting those values, and binary-searching them for each query keeps the total work small for the stated limits of up to 2000 queries and 2000 words.

#1170

Medium

Array · Hash · Scan

LeetCode Solution Workbench

Compare Strings by Frequency of the Smallest Character

Array · Hash Table · String · Binary Search · Sorting

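Problem Description

Define a function f(s) that returns the frequency of the lexicographically smallest character in s, where s is a non-empty string.

For example, if s = "dcce", then f(s) = 2, because the lexicographically smallest character is "c" and it appears 2 times.

You are given two string arrays, queries and words. For each query queries[i], count the number of words in words such that f(queries[i]) < f(W), where W ranges over every word in words.

Return an integer array answer, where answer[i] is the result of the i-th query.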

Example 1:

Input: queries = ["cbd"], words = ["zaaaz"]
Output: [1]
Explanation: For the query, f("cbd") = 1 and f("zaaaz") = 3, so f("cbd") < f("zaaaz").

Example 2:

Input: queries = ["bbb","cc"], words = ["a","aa","aaa","aaaa"]
Output: [1,2]
Explanation: For the first query, only f("bbb") < f("aaaa"). For the second query, both f("aaa") and f("aaaa") are > f("cc").

Constraints:

1 <= queries.length <= 2000
1 <= words.length <= 2000
1 <= queries[i].length, words[i].length <= 10
queries[i][j] and words[i][j] consist of lowercase English letters.
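A direct translation of the statement can serve as a reference when testing faster solutions. The sketch below is a brute-force baseline, not the approach this page recommends, and `num_smaller_brute_force` is an illustrative name.

```python
from typing import List


def f(s: str) -> int:
    return s.count(min(s))


def num_smaller_brute_force(queries: List[str], words: List[str]) -> List[int]:
    # For each query, compare against every word:
    # len(queries) * len(words) comparisons in total.
    return [sum(1 for w in words if f(q) < f(w)) for q in queries]
```

Under the constraints above this is at most 2000 × 2000 comparisons, so it remains practical as a cross-check even though it repeats work.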

Approach

Method 1: Sorting + Binary Search

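First, following the problem statement, we implement the function $f(s)$, which returns the frequency of the lexicographically smallest character in the string $s$.

Next, we compute $f(w)$ for every string $w$ in $words$, sort the results, and store them in the array $nums$.

Then we iterate over every string $q$ in $queries$ and binary-search $nums$ for the first position $i$ whose value is greater than $f(q)$. Every element of $nums$ at index $i$ or later satisfies $f(q) < f(W)$, so the answer to the current query is $n - i$, where $n$ is the length of $nums$.

The time complexity is $O((n + q) \times M + (n + q) \log n)$: computing all $f$ values costs $O((n + q) \times M)$, sorting $nums$ costs $O(n \log n)$, and the $q$ binary searches cost $O(q \log n)$. The space complexity is $O(n)$. Here $n$ and $q$ are the lengths of $words$ and $queries$ respectively, and $M$ is the maximum string length.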
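To make the binary-search step concrete, here is a small trace using the data from Example 2. It only illustrates how `bisect_right` from Python's standard library yields the index $i$; the variable names are chosen for this sketch.

```python
from bisect import bisect_right

nums = [1, 2, 3, 4]              # sorted f-values of ["a", "aa", "aaa", "aaaa"]
n = len(nums)

i = bisect_right(nums, 3)        # f("bbb") = 3; first index with value > 3
answer_bbb = n - i

j = bisect_right(nums, 2)        # f("cc") = 2; first index with value > 2
answer_cc = n - j
```

Here `i` is 3 and `j` is 2, so the two answers are 1 and 2, matching the expected output `[1,2]`. Using `bisect_right` rather than `bisect_left` is what keeps the comparison strict, since values equal to $f(q)$ are skipped.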


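from bisect import bisect_right
from collections import Counter
from string import ascii_lowercase
from typing import List


class Solution:
    def numSmallerByFrequency(self, queries: List[str], words: List[str]) -> List[int]:
        def f(s: str) -> int:
            # Frequency of the lexicographically smallest character in s.
            cnt = Counter(s)
            return next(cnt[c] for c in ascii_lowercase if cnt[c])

        n = len(words)
        # Sorted f-values of all words, computed once.
        nums = sorted(f(w) for w in words)
        # bisect_right returns the first index whose value is > f(q).
        return [n - bisect_right(nums, f(q)) for q in queries]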

Complexity Analysis

Metric	Value
Time	O((n + q) × M + (n + q) log n), where n is the number of words, q is the number of queries, and M is the maximum string length: computing f for every string, sorting the word frequencies, and one binary search per query.
Space	O(n) for the sorted word frequencies.

Common Interviewer Follow-ups

Do you precompute any values for words to avoid repeated work?
Can you explain how array scanning and frequency counting connect to the hash lookup?
How would you handle very large queries efficiently without scanning words repeatedly?

Common Pitfalls

Counting words with f(W) >= f(query) when the comparison must be strictly greater.
Recomputing f(W) for each query instead of precomputing and reusing it.
Mishandling words in which the smallest character repeats, for example counting it once instead of counting every occurrence.

Advanced Variants

Return counts where f(queries[i]) <= f(W) instead of strictly less than.
Find the maximum difference f(W) - f(query) for each query.
Handle uppercase and lowercase letters with the same frequency comparison logic.
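The first two variants need only small changes to the sorted-frequency idea. The sketch below is one possible reading of them; the function names `count_at_least` and `max_difference` are made up for illustration, and `words` is assumed to be non-empty.

```python
from bisect import bisect_left
from typing import List


def f(s: str) -> int:
    return s.count(min(s))


def count_at_least(queries: List[str], words: List[str]) -> List[int]:
    """Variant: count words W with f(query) <= f(W)."""
    nums = sorted(f(w) for w in words)
    n = len(nums)
    # bisect_left finds the first index whose value is >= f(q).
    return [n - bisect_left(nums, f(q)) for q in queries]


def max_difference(queries: List[str], words: List[str]) -> List[int]:
    """Variant: largest f(W) - f(query) over all words (may be negative)."""
    best = max(f(w) for w in words)  # assumes words is non-empty
    return [best - f(q) for q in queries]
```

Switching from `bisect_right` to `bisect_left` is the only change needed for the non-strict comparison, and the maximum difference depends only on the largest word frequency, so no sorting is required for it.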


Keep Practicing

#792 Number of Matching Subsequences #1169 Invalid Transactions #1202 Smallest String With Swaps