LeetCode Problem Workspace

Remove Sub-Folders from the Filesystem

Remove sub-folders in a filesystem by filtering out folders nested inside other folders.

category

4

Topics

code_blocks

6

Code langs

hub

3

Related

Practice Focus

Medium · Array plus String

bolt

Answer-first summary

Remove sub-folders in a filesystem by filtering out folders nested inside other folders.

Interview AiBox logo

Ace coding interviews with Interview AiBox guidance for Array plus String

Try AiBox Copilotarrow_forward

The problem involves removing sub-folders from a list of folders, ensuring no folder is a sub-folder of another. A common approach is to sort the folder paths lexicographically and then iterate through them to remove nested folders. Efficient solutions use Array-based strategies alongside Depth-First Search or Trie-based methods.

Problem Statement

You are given a list of folder paths in a filesystem, and your task is to return the list of folders after removing all sub-folders. A folder is considered a sub-folder of another if its path starts with the parent's path, followed by a slash ('/'). The result should not contain any folder that is a sub-folder of another, and the order of the folders in the output does not matter.

For example, in the input list ['a/b', 'a/c', 'a/b/c'], 'a/b/c' is a sub-folder of 'a/b' and should be excluded. The goal is to find an efficient way to filter out these nested folders.

Examples

Example 1

Input: folder = ["/a","/a/b","/c/d","/c/d/e","/c/f"]

Output: ["/a","/c/d","/c/f"]

Folders "/a/b" is a subfolder of "/a" and "/c/d/e" is inside of folder "/c/d" in our filesystem.

Example 2

Input: folder = ["/a","/a/b/c","/a/b/d"]

Output: ["/a"]

Folders "/a/b/c" and "/a/b/d" will be removed because they are subfolders of "/a".

Example 3

Input: folder = ["/a/b/c","/a/b/ca","/a/b/d"]

Output: ["/a/b/c","/a/b/ca","/a/b/d"]

Example details omitted.

Constraints

  • 1 <= folder.length <= 4 * 104
  • 2 <= folder[i].length <= 100
  • folder[i] contains only lowercase letters and '/'.
  • folder[i] always starts with the character '/'.
  • Each folder name is unique.

Solution Approach

Sort Lexicographically

Start by sorting the folder paths in lexicographical order. This ensures that when we process the list, parent folders always appear before their sub-folders. This ordering simplifies the detection of sub-folders, as any folder that starts with another folder's path (plus '/') will be its sub-folder.

Iterate and Filter

After sorting, iterate through the list of folder paths. For each folder, compare it with the last added folder. If the current folder starts with the last added folder's path (plus '/'), it is a sub-folder and should be ignored. Otherwise, add it to the result list.

Optimize Space and Time

The approach uses sorting and a single pass through the list, achieving an efficient solution with time complexity O(N * L) and space complexity O(N * L), where N is the number of folders and L is the length of each folder path.

Complexity Analysis

Metric Value
Time O(N \times L)
Space O(N \times L)

The solution sorts the folder list, which takes O(N log N) time, followed by a linear scan to filter sub-folders, resulting in O(N * L) time complexity. The space complexity is O(N * L) due to the storage of the folder paths in the result list.

What Interviewers Usually Probe

  • Look for understanding of sorting techniques and their applications in path comparison.
  • Expect candidates to demonstrate a clear approach to handling nested structures with efficient filtering.
  • Assess whether the candidate can balance time complexity with simplicity, particularly regarding sorting and iteration.

Common Pitfalls or Variants

Common pitfalls

  • Not sorting the list, which can lead to incorrect results when detecting sub-folders.
  • Forgetting to account for the '/' separator when comparing folder paths.
  • Overcomplicating the solution with unnecessary data structures or algorithms when a simple sorting and iteration approach is sufficient.

Follow-up variants

  • What if folder paths have different levels of depth, or are not sorted initially?
  • How would this approach change if the folder paths could be represented in a tree structure instead of an array?
  • What if the problem extended to a scenario where sub-folders had additional metadata?

FAQ

What is the main approach to solving the 'Remove Sub-Folders from the Filesystem' problem?

The main approach involves sorting the folder paths lexicographically and then filtering out sub-folders by comparing each folder with the previously added folder.

How does sorting help in solving this problem?

Sorting ensures that the parent folders come before their sub-folders, making it easier to identify and exclude sub-folders during the iteration process.

What is the time complexity of this solution?

The time complexity is O(N * L), where N is the number of folders and L is the average length of a folder path.

Can this problem be solved with a Trie data structure?

Yes, a Trie can be used to represent the folder paths and help identify and exclude sub-folders efficiently. However, sorting the paths is often a simpler approach.

How does GhostInterview assist with this problem?

GhostInterview offers structured hints and interactive examples that guide you through sorting, filtering, and optimizing solutions for the 'Remove Sub-Folders from the Filesystem' problem.

terminal

Solution

Solution 1: Sorting

First, we sort the array `folder` in lexicographical order, then traverse the array. For the current folder $f$ we are traversing, if its length is greater than or equal to the length of the last folder in the answer array, and its prefix includes the last folder in the answer array plus a `/`, then $f$ is a subfolder of the last folder in the answer array, and we don't need to add it to the answer array. Otherwise, we add $f$ to the answer array.

1
2
3
4
5
6
7
8
9
class Solution:
    def removeSubfolders(self, folder: List[str]) -> List[str]:
        folder.sort()
        ans = [folder[0]]
        for f in folder[1:]:
            m, n = len(ans[-1]), len(f)
            if m >= n or not (ans[-1] == f[:m] and f[m] == '/'):
                ans.append(f)
        return ans

Solution 2: Trie

We can use a trie to store all the folders in the array `folder`. Each node of the trie contains a `children` field, used to store the child nodes of the current node, and a `fid` field, used to store the index of the folder corresponding to the current node in the array `folder`.

1
2
3
4
5
6
7
8
9
class Solution:
    def removeSubfolders(self, folder: List[str]) -> List[str]:
        folder.sort()
        ans = [folder[0]]
        for f in folder[1:]:
            m, n = len(ans[-1]), len(f)
            if m >= n or not (ans[-1] == f[:m] and f[m] == '/'):
                ans.append(f)
        return ans
Remove Sub-Folders from the Filesystem Solution: Array plus String | LeetCode #1233 Medium