How do I handle duplicate names when creating folders?

The solution involves checking each folder name against a hash table to see if it already exists. If it does, append the smallest integer to make it unique.

What is the time complexity of the solution?

The time complexity is O(n), where n is the number of names, as each name is processed in constant time due to hash table lookups.

How do hash tables help solve this problem efficiently?

Hash tables allow for constant time lookup and insertion, making it easy to check for duplicates and generate the correct suffix when needed.

What happens if the same name appears multiple times?

Each time the same name appears, a suffix is added with the smallest integer that ensures the name remains unique. For example, 'name(1)', 'name(2)', etc.

Can this problem be solved using only an array?

While it is possible to use an array, a hash table is much more efficient for this problem, providing faster lookups and inserts.

#1487

Medium

auto_awesome数组·哈希·扫描

LeetCode 题解工作台

保证文件名唯一

给你一个长度为 n 的字符串数组 names 。你将会在文件系统中创建 n 个文件夹：在第 i 分钟，新建名为 names[i] 的文件夹。由于两个文件不能共享相同的文件名，因此如果新建文件夹使用的文件名已经被占用，系统会以 (k) 的形式为新文件夹的文件名添加后缀，其中 k 是能保证文件名唯…

数组哈希表字符串

题目描述

给你一个长度为 n 的字符串数组 names 。你将会在文件系统中创建 n 个文件夹：在第 i 分钟，新建名为 names[i] 的文件夹。

由于两个文件不能共享相同的文件名，因此如果新建文件夹使用的文件名已经被占用，系统会以 (k) 的形式为新文件夹的文件名添加后缀，其中 k 是能保证文件名唯一的 最小正整数 。

返回长度为 n 的字符串数组，其中 ans[i] 是创建第 i 个文件夹时系统分配给该文件夹的实际名称。

示例 1：

输入：names = ["pes","fifa","gta","pes(2019)"]
输出：["pes","fifa","gta","pes(2019)"]
解释：文件系统将会这样创建文件名：
"pes" --> 之前未分配，仍为 "pes"
"fifa" --> 之前未分配，仍为 "fifa"
"gta" --> 之前未分配，仍为 "gta"
"pes(2019)" --> 之前未分配，仍为 "pes(2019)"

示例 2：

输入：names = ["gta","gta(1)","gta","avalon"]
输出：["gta","gta(1)","gta(2)","avalon"]
解释：文件系统将会这样创建文件名：
"gta" --> 之前未分配，仍为 "gta"
"gta(1)" --> 之前未分配，仍为 "gta(1)"
"gta" --> 文件名被占用，系统为该名称添加后缀 (k)，由于 "gta(1)" 也被占用，所以 k = 2 。实际创建的文件名为 "gta(2)" 。
"avalon" --> 之前未分配，仍为 "avalon"

示例 3：

输入：names = ["onepiece","onepiece(1)","onepiece(2)","onepiece(3)","onepiece"]
输出：["onepiece","onepiece(1)","onepiece(2)","onepiece(3)","onepiece(4)"]
解释：当创建最后一个文件夹时，最小的正有效 k 为 4 ，文件名变为 "onepiece(4)"。

示例 4：

输入：names = ["wano","wano","wano","wano"]
输出：["wano","wano(1)","wano(2)","wano(3)"]
解释：每次创建文件夹 "wano" 时，只需增加后缀中 k 的值即可。

示例 5：

输入：names = ["kaido","kaido(1)","kaido","kaido(1)"]
输出：["kaido","kaido(1)","kaido(2)","kaido(1)(1)"]
解释：注意，如果含后缀文件名被占用，那么系统也会按规则在名称后添加新的后缀 (k) 。

提示：

1 <= names.length <= 5 * 10^4
1 <= names[i].length <= 20
names[i] 由小写英文字母、数字和/或圆括号组成。

lightbulb

解题思路

方法一：哈希表

我们可以用哈希表 $d$ 记录每个文件夹的最小可用编号，其中 $d[name] = k$ 表示文件夹 $name$ 的最小可用编号为 $k$ 。初始时， $d$ 中没有任何文件夹，因此 $d$ 为空。

接下来遍历文件夹数组，对于每个文件名 $name$ ：

如果 $name$ 在 $d$ 中，说明文件夹 $name$ 已经存在，我们需要找到一个新的文件夹名字。我们可以不断地尝试 $name(k)$ ，其中 $k$ 从 $d[name]$ 开始，直到找到一个文件夹名字 $name(k)$ 不存在于 $d$ 中为止。我们将 $name(k)$ 加入 $d$ 中，并将 $d[name]$ 更新为 $k + 1$ 。然后我们将 $name$ 更新为 $name(k)$ 。
如果 $name$ 不在 $d$ 中，我们可以直接将 $name$ 加入 $d$ 中，并将 $d[name]$ 更新为 $1$ 。
接着我们将 $name$ 加入答案数组。然后继续遍历下一个文件名。

遍历完所有文件名后，我们即可得到答案数组。

在以下代码实现中，我们直接修改文件名数组 $names$ ，而不使用额外的答案数组。

时间复杂度 $O(L)$ ，空间复杂度 $O(L)$ ，其中 $L$ 为数组 $names$ 中所有文件名的长度之和。

1

2

3

4

5

6

7

8

9

10

11

12

13

class Solution:
    def getFolderNames(self, names: List[str]) -> List[str]:
        d = defaultdict(int)
        for i, name in enumerate(names):
            if name in d:
                k = d[name]
                while f'{name}({k})' in d:
                    k += 1
                d[name] = k + 1
                names[i] = f'{name}({k})'
            d[names[i]] = 1
        return names

speed

复杂度分析

指标	值
时间	Depends on the final approach
空间	Depends on the final approach

psychology

面试官常问的追问

外企场景

question_mark
Can the candidate explain the trade-off between using a hash table and simpler data structures like arrays?
question_mark
Does the candidate understand the need for tracking names and suffixes to handle duplicates efficiently?
question_mark
Can the candidate clearly articulate why the problem requires constant time lookups for scalability?

warning

常见陷阱

外企场景

error
Not updating the counter for suffix generation correctly, leading to collisions in names.
error
Using a more complex data structure than necessary (like a list instead of a hash table) that would increase time complexity.
error
Failing to handle the case where the same name is used multiple times with multiple suffixes, and not properly appending the smallest integer.

swap_horiz

进阶变体

外企场景

arrow_right_alt
Instead of appending a number as a suffix, the problem could involve appending letters or symbols to ensure uniqueness.
arrow_right_alt
The input could contain a mixture of already unique names and duplicates, testing the ability to efficiently manage both cases.
arrow_right_alt
The problem could be extended to handle additional constraints, such as large file systems where names need to be stored and updated in real time.

help

常见问题

外企场景

继续练习

#1452 收藏清单 #1436 旅行终点站 #1418 点菜展示表