feat: add .ts_ignore pattern ignoring system (#897)

* feat: add `.ts_ignore` pattern ignoring system

* fix: add wcmatch dependency

* search: add ".TemporaryItems" to GLOBAL_IGNORE

* add `desktop.ini` and `.localized` to global ignore

* add ".fhdx" and ".ts" filetypes

* chore: remove logging statement

* chore: format with ruff

* feat: use ripgrep for scanning if available

* docs: add ignore.md

* search: remove ts_ignore filtering on queries

* feat: detect if files are added but ignored

* fix: render edges on all unlinked thumbs

* perf: don't search for cached unlinked thumbs

* fix(ui): ensure newlines in file stats

* fix: use ignore_to_glob for wcmatch

* fix(tests): remove inconsistent test

The test hinged on the timing of refresh_dir()'s yields rather than actual values

* ui: change ignored icon and color
Travis Abendshien
2025-08-21 15:50:59 -07:00
committed by GitHub
parent d00546d5fe
commit 0e7a2dfd3d
23 changed files with 911 additions and 87 deletions


@@ -211,6 +211,14 @@ Don't forget to rebuild!
## Third-Party Dependencies
<!-- prettier-ignore -->
!!! tip
You can check to see if any of these dependencies are correctly located by launching TagStudio and going to "About TagStudio" in the menu bar.
### FFmpeg/FFprobe
For audio/video thumbnails and playback you'll need [FFmpeg](https://ffmpeg.org/download.html) installed on your system. If you encounter any issues with this, please reference our [FFmpeg Help](./help/ffmpeg.md) guide.
You can check to see if FFmpeg and FFprobe are correctly located by launching TagStudio and going to "About TagStudio" in the menu bar.
### ripgrep
A recommended tool to improve the performance of directory scanning is [`ripgrep`](https://github.com/BurntSushi/ripgrep), a Rust-based directory walker that natively integrates with our [`.ts_ignore`](./utilities/ignore.md) (`.gitignore`-style) pattern matching system for excluding files and directories. Ripgrep is already pre-installed on some Linux distributions and also available from several package managers.

321
docs/utilities/ignore.md Normal file

@@ -0,0 +1,321 @@
---
title: Ignore Files
---
# :material-file-document-remove: Ignore Files & Directories
<!-- prettier-ignore -->
!!! warning "Legacy File Extension Ignoring"
TagStudio versions prior to v9.5.4 use a different, more limited method to exclude or include file extensions from your library and subsequent searches. Opening a pre-existing library in v9.5.4 or later will non-destructively convert this to the newer, more extensive `.ts_ignore` format.
If you're still running an older version of TagStudio in the meantime, you can access the legacy system by going to "Edit -> Manage File Extensions" in the menubar.
TagStudio offers the ability to ignore specific files and directories via a `.ts_ignore` file located inside your [library's](../library/index.md) `.TagStudio` folder. This file is designed to use [glob](<https://en.wikipedia.org/wiki/Glob_(programming)>)-style pattern matching very similar to that of the [`.gitignore`](https://git-scm.com/docs/gitignore) file used by Git™[^1]. It can be edited within TagStudio or opened in an external program by going to the "Edit -> Ignore Files" option in the menubar.
This file is only referenced when scanning directories for new files to add to your library, and does not apply to files that have already been added to your library.
<!-- prettier-ignore -->
!!! tip
If you just want some specific examples of how to achieve common tasks with the ignore patterns (e.g. ignoring a single file type, ignoring a specific folder) then jump to the "[Use Cases](#use-cases)" section!
<!-- prettier-ignore-start -->
=== "Example .ts_ignore file"
```toml title="My Library/.TagStudio/.ts_ignore"
# TagStudio .ts_ignore file.
# Code
__pycache__
.pytest_cache
.venv
.vs
# Projects
Minecraft/**/Metadata
Minecraft/Website
!Minecraft/Website/*.png
!Minecraft/Website/*.css
# Documents
*.doc
*.docx
*.ppt
*.pptx
*.xls
*.xlsx
```
<!-- prettier-ignore-end -->
## Pattern Format
<!-- prettier-ignore -->
!!! note ""
_This section sourced and adapted from Git's[^1] `.gitignore` [documentation](https://git-scm.com/docs/gitignore)._
### Internal Processes
When scanning your library directories, the `.ts_ignore` file is read by either the [`wcmatch`](https://facelessuser.github.io/wcmatch/glob/) library or [`ripgrep`](https://github.com/BurntSushi/ripgrep) in glob mode, depending on whether you have the latter installed on your system and it's detected by TagStudio. Ripgrep is the preferred method for scanning directories due to its improved performance and identical pattern matching to `.gitignore`. This mixture of tools may lead to slight inconsistencies if not using `ripgrep`.
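The tool selection described above can be sketched in a few lines. This is only an illustration, assuming detection amounts to looking for an `rg` executable on the system `PATH`; the `pick_scanner` helper is a hypothetical name and not part of TagStudio.

```python
import shutil


def pick_scanner() -> str:
    """Return the name of the scanning tool that would be used (hypothetical helper)."""
    # shutil.which returns the executable's full path, or None if it isn't on PATH.
    return "ripgrep" if shutil.which("rg") is not None else "wcmatch"


print(pick_scanner())
```

Which of the two names is printed depends on whether `ripgrep` is installed on the machine running the snippet.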
---
### Comments ( `#` )
A `#` symbol at the start of a line indicates that this line is a comment and matches no items. Blank lines are used to enhance readability and also match no items.
- Can be escaped by putting a backslash ("`\`") in front of the `#` symbol.
<!-- prettier-ignore-start -->
=== "Example comment"
```toml
# This is a comment! I can say whatever I want on this line.
file_that_is_being_matched.txt
# file_that_is_NOT_being_matched.png
file_that_is_being_matched.png
```
=== "Organizing with comments"
```toml
# TagStudio .ts_ignore file.
# Minecraft Stuff
Minecraft/**/Metadata
Minecraft/Website
!Minecraft/Website/*.png
!Minecraft/Website/*.css
# Microsoft Office
*.doc
*.docx
*.ppt
*.pptx
*.xls
*.xlsx
```
=== "Escape a # symbol"
```toml
# To ensure a file named '#hashtag.jpg' is ignored:
\#hashtag.jpg
```
<!-- prettier-ignore-end -->
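To make the comment rules concrete, here is a minimal sketch of how the text of a `.ts_ignore` file could be reduced to a list of patterns. It assumes that only lines whose first non-whitespace character is `#` count as comments; `parse_ignore_text` is a hypothetical name used for illustration.

```python
def parse_ignore_text(text: str) -> list[str]:
    """Return the non-blank, non-comment lines of an ignore file (illustrative)."""
    patterns: list[str] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Blank lines and comment lines match nothing, so they are skipped.
        if not line or line.startswith("#"):
            continue
        # An escaped "\#" line starts with a backslash, so it is kept as a pattern.
        patterns.append(line)
    return patterns


example = "# Microsoft Office\n*.doc\n\n\\#hashtag.jpg\n"
print(parse_ignore_text(example))
```

Note that the escaped line is kept with its leading backslash; interpreting the escape is left to the pattern matcher.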
---
### Directories ( `/` )
The forward slash "`/`" is used as the directory separator. Separators may occur at the beginning, middle or end of the `.ts_ignore` search pattern.
- If there is a separator at the beginning or middle (or both) of the pattern, then the pattern is relative to the directory level of the particular `.TagStudio` library folder itself. Otherwise the pattern may also match at any level below the `.TagStudio` folder level.
- If there is a separator at the end of the pattern then the pattern will only match directories, otherwise the pattern can match both files and directories.
<!-- prettier-ignore-start -->
=== "Example folder pattern"
```toml
# Matches "frotz" and "a/frotz" if they are directories.
frotz/
```
=== "Example nested folder pattern"
```toml
# Matches "doc/frotz" but not "a/doc/frotz".
doc/frotz/
```
<!-- prettier-ignore-end -->
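The two separator rules above can be expressed as a small classifier. This is a sketch for illustration only; `describe_pattern` is a hypothetical helper, and it deliberately ignores negation and wildcards.

```python
def describe_pattern(pattern: str) -> dict[str, bool]:
    """Classify a pattern by its directory separators (illustrative only)."""
    # A trailing slash restricts the pattern to directories.
    directories_only = pattern.endswith("/")
    # A slash at the beginning or in the middle anchors the pattern to the library root.
    anchored = "/" in pattern.removesuffix("/")
    return {"anchored": anchored, "directories_only": directories_only}


print(describe_pattern("frotz/"))      # not anchored, directories only
print(describe_pattern("doc/frotz/"))  # anchored, directories only
print(describe_pattern("*.png"))       # not anchored, files and directories
```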
---
### Negation ( `!` )
A `!` prefix before a pattern negates the pattern, allowing any files matched by previous patterns to be un-matched.
- Any matching file excluded by a previous pattern will become included again.
- **It is not possible to re-include a file if a parent directory of that file is excluded.**
<!-- prettier-ignore-start -->
=== "Example negation"
```toml
# All .jpg files will be ignored, except any located in the 'Photos' folder.
*.jpg
!Photos/*.jpg
```
=== "Escape a ! Symbol"
```toml
# To ensure a file named '!wowee.jpg' is ignored:
\!wowee.jpg
```
<!-- prettier-ignore-end -->
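The "last matching pattern wins" behavior of negation can be sketched with Python's standard `fnmatch` module. This is a simplified model, not how `wcmatch` or `ripgrep` evaluate patterns: `is_ignored` is a hypothetical helper, `fnmatch`'s `*` also matches slashes, and the rule about excluded parent directories is not modeled.

```python
from fnmatch import fnmatchcase


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Apply patterns in order; the last pattern that matches decides (simplified)."""
    ignored = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        body = pattern[1:] if negated else pattern
        # Patterns without a slash are compared against the file name only.
        target = path if "/" in body else path.rsplit("/", 1)[-1]
        if fnmatchcase(target, body):
            ignored = not negated
    return ignored


rules = ["*.jpg", "!Photos/*.jpg"]
print(is_ignored("cat.jpg", rules))         # True
print(is_ignored("Photos/cat.jpg", rules))  # False
```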
---
### Wildcards
#### Single Asterisks ( `*` )
An asterisk "`*`" matches anything except a slash.
<!-- prettier-ignore-start -->
=== "File examples"
```toml
# Matches all .png files in the "Images" folder.
Images/*.png
# Matches all .png files in all folders
*.png
```
=== "Folder examples"
```toml
# Matches any files or folders directly in "Images/" but not deeper levels.
# Matches file "Images/mario.jpg"
# Matches folder "Images/Mario"
# Does not match file "Images/Mario/cat.jpg"
Images/*
```
<!-- prettier-ignore-end -->
#### Question Marks ( `?` )
The character "`?`" matches any one character except "`/`".
<!-- prettier-ignore-start -->
=== "File examples"
```toml
# Matches any .png file starting with "IMG_" and ending in any four characters.
# Matches "IMG_0001.png"
# Matches "Photos/IMG_1234.png"
# Does not match "IMG_1.png"
IMG_????.png
# Same as above, except matches any file extension instead of only .png
IMG_????.*
```
=== "Folder examples"
```toml
# Matches any direct subfolder of "Photos" with a four-character name beginning with "20".
# Matches "Photos/2000"
# Matches "Photos/2024"
# Matches "Photos/2099"
# Does not match "Photos/1995"
Photos/20??/
```
<!-- prettier-ignore-end -->
#### Double Asterisks ( `**` )
Two consecutive asterisks ("`**`") in patterns matched against full pathname may have special meaning:
- A leading "`**`" followed by a slash means matches in all directories.
- A trailing "`/**`" matches everything inside.
- A slash followed by two consecutive asterisks then a slash ("`/**/`") matches zero or more directories.
- Other consecutive asterisks are considered regular asterisks and will match according to the previous rules.
<!-- prettier-ignore-start -->
=== "Leading **"
```toml
# Both match file or directory "foo" anywhere
**/foo
foo
# Matches file or directory "bar" anywhere that is directly under directory "foo"
**/foo/bar
```
=== "Trailing /**"
```toml
# Matches all files inside directory "abc" with infinite depth.
abc/**
```
=== "Middle /**/"
```toml
# Matches "a/b", "a/x/b", "a/x/y/b" and so on.
a/**/b
```
<!-- prettier-ignore-end -->
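One way to build intuition for the `**` rules is to translate a pattern into a regular expression. The sketch below is a simplified translation written for this illustration (the `glob_to_regex` name is hypothetical); it handles only `*`, `?` and the three `**` forms listed above, and it treats every pattern as relative to the library root.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern[str]:
    """Translate a small subset of ignore-pattern syntax to a regex (illustrative)."""
    out: list[str] = []
    i, n = 0, len(pattern)
    while i < n:
        if pattern.startswith("/**/", i):
            out.append("/(?:.*/)?")  # zero or more directories
            i += 4
        elif i == 0 and pattern.startswith("**/"):
            out.append("(?:.*/)?")  # any leading directories
            i += 3
        elif pattern.startswith("/**", i) and i + 3 == n:
            out.append("/.*")  # everything inside
            i += 3
        elif pattern[i] == "*":
            out.append("[^/]*")  # anything except a slash
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")  # one character except a slash
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


middle = glob_to_regex("a/**/b")
print([p for p in ["a/b", "a/x/b", "a/x/y/b", "a/xb"] if middle.fullmatch(p)])
```

With the sample paths above, the first three match and `a/xb` does not.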
#### Square Brackets ( `[a-Z]` )
Character sets and ranges are specific and powerful forms of wildcards that use characters inside of brackets (`[]`) to leverage very specific matching. The range notation, e.g. `[a-zA-Z]`, can be used to match one of the characters in a range.
<!-- prettier-ignore -->
!!! tip
For more in-depth examples and explanations on how to use ranges, please reference the [`glob`](https://man7.org/linux/man-pages/man7/glob.7.html) man page.
<!-- prettier-ignore-start -->
=== "Range examples"
```toml
# Matches all files named "IMG_" followed by a single numeric character, with any extension.
# Matches "IMG_0.jpg", "IMG_7.png"
# Does not match "IMG_10.jpg", "IMG_A.jpg"
IMG_[0-9].*
# Matches all files named "IMG_" followed by a single lowercase alphabetic character, with any extension.
IMG_[a-z].*
```
=== "Set examples"
```toml
# Matches all files named "draft_" followed by any single character in the set, with any extension.
# Matches "draft_a.docx", "draft_b.docx", "draft_c.docx"
# Does not match "draft_d.docx"
draft_[abc].*
# Matches all files named "IMG_" followed by a single lowercase alphabetic character, with any extension.
IMG_[a-z].*
```
<!-- prettier-ignore-end -->
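Ranges and sets can be tried out quickly with Python's standard `fnmatch` module, whose bracket syntax follows the same glob conventions. This is only a way to experiment with single file names, assuming case-sensitive matching; it is not the matcher TagStudio uses, and the `matching` helper and sample names are made up for the example.

```python
from fnmatch import fnmatchcase

names = [
    "IMG_0.jpg",
    "IMG_7.png",
    "IMG_10.jpg",
    "IMG_A.jpg",
    "draft_a.docx",
    "draft_d.docx",
]


def matching(pattern: str) -> list[str]:
    """Return the sample names that match the pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(matching("IMG_[0-9].*"))    # ['IMG_0.jpg', 'IMG_7.png']
print(matching("draft_[abc].*"))  # ['draft_a.docx']
```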
---
## Use Cases
### Ignoring Files by Extension
<!-- prettier-ignore -->
=== "Ignore all .jpg files"
```toml
*.jpg
```
=== "Ignore all files EXCEPT .jpg files"
```toml
*
!*.jpg
```
=== "Ignore all .jpg files in specific folders"
```toml
Photos/Worst Vacation/*.jpg
Music/Artwork Art/*.jpg
```
<!-- prettier-ignore -->
!!! tip "Ensuring Complete Extension Matches"
For some filetypes, it may be necessary to specify different casing and alternative spellings in order to match all possible variations of an extension in your library.
```toml title="Ignore (Most) Possible JPEG File Extensions"
# The JPEG Cinematic Universe
*.jpg
*.jpeg
*.jfif
*.jpeg_large
*.JPG
*.JPEG
*.JFIF
*.JPEG_LARGE
```
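If you maintain many extensions, a list like the one above can be generated rather than typed by hand. The following is a small sketch, assuming that lower-case and upper-case spellings are the only variants you care about; `extension_patterns` is a hypothetical helper and not part of TagStudio.

```python
def extension_patterns(extensions: list[str]) -> list[str]:
    """Build lower- and upper-case ignore patterns for each extension (illustrative)."""
    patterns: list[str] = []
    for ext in extensions:
        for variant in (ext.lower(), ext.upper()):
            pattern = f"*.{variant}"
            # Skip duplicates, e.g. for extensions made only of digits.
            if pattern not in patterns:
                patterns.append(pattern)
    return patterns


print("\n".join(extension_patterns(["jpg", "jpeg", "jfif", "jpeg_large"])))
```

The printed lines can be pasted directly into your `.ts_ignore` file.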
### Ignoring a Folder
<!-- prettier-ignore -->
=== "Ignore all "Cache" folders"
```toml
# Matches any folder called "Cache" no matter where it is in your library.
Cache/
```
=== "Ignore a "Downloads" folder"
```toml
# "Downloads" must be a folder on the same level as your ".TagStudio" folder.
# Does not match with folders named "Downloads" elsewhere in your library
# Does not match with a file called "Downloads"
/Downloads/
```
=== "Ignore .jpg files in specific folders"
```toml
Photos/Worst Vacation/*.jpg
/Music/Artwork Art/*.jpg
```
[^1]: The term "Git" is a licensed trademark of "The Git Project", a member of the Software Freedom Conservancy. Git is released under the [GNU General Public License version 2.0](https://opensource.org/license/GPL-2.0), an open source license. TagStudio is not associated with the Git Project; it only includes systems based on some of those found therein.


@@ -43,6 +43,7 @@ nav:
- library/tag_categories.md
- library/tag_color.md
- Utilities:
- utilities/ignore.md
- utilities/macro.md
- Updates:
- updates/changelog.md


@@ -6,6 +6,7 @@
qt6,
stdenv,
wrapGAppsHook,
wcmatch,
pillow-jxl-plugin,
pyside6,


@@ -30,6 +30,7 @@ dependencies = [
"toml~=0.10",
"typing_extensions~=4.13",
"ujson~=5.10",
"wcmatch==10.*",
]
[project.optional-dependencies]


@@ -9,6 +9,7 @@ VERSION_BRANCH: str = "" # Usually "" or "Pre-Release"
TS_FOLDER_NAME: str = ".TagStudio"
BACKUP_FOLDER_NAME: str = "backups"
COLLAGE_FOLDER_NAME: str = "collages"
IGNORE_NAME: str = ".ts_ignore"
THUMB_CACHE_NAME: str = "thumbs"
FONT_SAMPLE_TEXT: str = (


@@ -92,6 +92,7 @@ if TYPE_CHECKING:
logger = structlog.get_logger(__name__)
TAG_CHILDREN_QUERY = text("""
-- Note for this entire query that tag_parents.child_id is the parent id and tag_parents.parent_id is the child id due to bad naming
WITH RECURSIVE ChildTags AS (
@@ -659,7 +660,10 @@ class Library:
entry_stmt = (
entry_stmt.outerjoin(Entry.text_fields)
.outerjoin(Entry.datetime_fields)
.options(selectinload(Entry.text_fields), selectinload(Entry.datetime_fields))
.options(
selectinload(Entry.text_fields),
selectinload(Entry.datetime_fields),
)
)
# if with_tags:
# entry_stmt = entry_stmt.outerjoin(Entry.tags).options(selectinload(Entry.tags))
@@ -885,6 +889,7 @@ class Library:
"""
assert isinstance(search, BrowsingState)
assert self.engine
assert self.library_dir
with Session(self.engine, expire_on_commit=False) as session:
statement = select(Entry.id, func.count().over())
@@ -897,6 +902,7 @@ class Library:
f"SQL Expression Builder finished ({format_timespan(end_time - start_time)})"
)
# TODO: Remove this from the search function and update tests.
extensions = self.prefs(LibraryPrefs.EXTENSION_LIST)
is_exclude_list = self.prefs(LibraryPrefs.IS_EXCLUDE_LIST)
@@ -905,6 +911,8 @@ class Library:
elif extensions:
statement = statement.where(Entry.suffix.in_(extensions))
statement = statement.distinct(Entry.id)
sort_on: ColumnExpressionArgument = Entry.id
match search.sorting_mode:
case SortingModeEnum.DATE_ADDED:
@@ -1710,7 +1718,10 @@ class Library:
session.expunge(en)
return dict(
sorted(color_groups.items(), key=lambda kv: self.get_namespace_name(kv[0]).lower())
sorted(
color_groups.items(),
key=lambda kv: self.get_namespace_name(kv[0]).lower(),
)
)
@property


@@ -0,0 +1,154 @@
# Copyright (C) 2025 Travis Abendshien (CyanVoxel).
# Licensed under the GPL-3.0 License.
# Created for TagStudio: https://github.com/CyanVoxel/TagStudio
from copy import deepcopy
from pathlib import Path
import structlog
import wcmatch.fnmatch as fnmatch
from wcmatch import glob, pathlib
from tagstudio.core.constants import IGNORE_NAME, TS_FOLDER_NAME
from tagstudio.core.singleton import Singleton
logger = structlog.get_logger()
PATH_GLOB_FLAGS = glob.GLOBSTARLONG | glob.DOTGLOB | glob.NEGATE | pathlib.MATCHBASE
GLOBAL_IGNORE = [
# TagStudio -------------------
f"{TS_FOLDER_NAME}",
# Trash -----------------------
".Trash-*",
".Trash",
".Trashes",
"$RECYCLE.BIN",
# System ----------------------
"._*",
".DS_Store",
".fseventsd",
".Spotlight-V100",
".TemporaryItems",
"desktop.ini",
"System Volume Information",
".localized",
]
def ignore_to_glob(ignore_patterns: list[str]) -> list[str]:
"""Convert .gitignore-like patterns to explicit glob syntax.
Args:
ignore_patterns (list[str]): The .gitignore-like patterns to convert.
"""
glob_patterns: list[str] = deepcopy(ignore_patterns)
additional_patterns: list[str] = []
# Mimic implicit .gitignore syntax behavior for the SQLite GLOB function.
for pattern in glob_patterns:
# Temporarily remove any exclusion character before processing
exclusion_char = ""
gp = pattern
if pattern.startswith("!"):
gp = pattern[1:]
exclusion_char = "!"
if not gp.startswith("**/") and not gp.startswith("*/") and not gp.startswith("/"):
# Create a version of a prefix-less pattern that starts with "**/"
gp = "**/" + gp
additional_patterns.append(exclusion_char + gp)
gp = gp.removesuffix("/**").removesuffix("/*").removesuffix("/")
additional_patterns.append(exclusion_char + gp)
gp = gp.removeprefix("**/").removeprefix("*/")
additional_patterns.append(exclusion_char + gp)
glob_patterns = glob_patterns + additional_patterns
# Add "/**" suffix to suffix-less patterns to match implicit .gitignore behavior.
for pattern in glob_patterns:
if pattern.endswith("/**"):
continue
glob_patterns.append(pattern.removesuffix("/*").removesuffix("/") + "/**")
glob_patterns = list(set(glob_patterns))
logger.info("[Ignore]", glob_patterns=glob_patterns)
return glob_patterns
class Ignore(metaclass=Singleton):
"""Class for processing and managing glob-like file ignore file patterns."""
_last_loaded: tuple[Path, float] | None = None
_patterns: list[str] = []
compiled_patterns: fnmatch.WcMatcher | None = None
@staticmethod
def get_patterns(library_dir: Path, include_global: bool = True) -> list[str]:
"""Get the ignore patterns for the given library directory.
Args:
library_dir (Path): The path of the library to load patterns from.
include_global (bool): Flag for including the global ignore set.
In most scenarios, this should be True.
"""
patterns = GLOBAL_IGNORE if include_global else []
ts_ignore_path = Path(library_dir / TS_FOLDER_NAME / IGNORE_NAME)
if not ts_ignore_path.exists():
logger.info(
"[Ignore] No .ts_ignore file found",
path=ts_ignore_path,
)
Ignore._last_loaded = None
Ignore._patterns = patterns
return Ignore._patterns
# Process the .ts_ignore file if the previous result is non-existent or outdated.
loaded = (ts_ignore_path, ts_ignore_path.stat().st_mtime)
if not Ignore._last_loaded or (Ignore._last_loaded and Ignore._last_loaded != loaded):
logger.info(
"[Ignore] Processing the .ts_ignore file...",
library=library_dir,
last_mtime=Ignore._last_loaded[1] if Ignore._last_loaded else None,
new_mtime=loaded[1],
)
Ignore._patterns = patterns + Ignore._load_ignore_file(ts_ignore_path)
Ignore.compiled_patterns = fnmatch.compile(
"*", PATH_GLOB_FLAGS, exclude=ignore_to_glob(Ignore._patterns)
)
else:
logger.info(
"[Ignore] No updates to the .ts_ignore detected",
library=library_dir,
last_mtime=Ignore._last_loaded[1],
new_mtime=loaded[1],
)
Ignore._last_loaded = loaded
return Ignore._patterns
@staticmethod
def _load_ignore_file(path: Path) -> list[str]:
"""Load and process the .ts_ignore file into a list of glob patterns.
Args:
path (Path): The path of the .ts_ignore file.
"""
patterns: list[str] = []
if path.exists():
with open(path, encoding="utf8") as f:
for line_raw in f.readlines():
line = line_raw.strip()
# Ignore blank lines and comments
if not line or line.startswith("#"):
continue
patterns.append(line)
return patterns


@@ -243,7 +243,7 @@ class MediaCategories:
".sqlite",
".sqlite3",
}
_DISK_IMAGE_SET: set[str] = {".bios", ".dmg", ".iso"}
_DISK_IMAGE_SET: set[str] = {".bios", ".dmg", ".fhdx", ".iso"}
_DOCUMENT_SET: set[str] = {
".doc",
".docm",
@@ -413,6 +413,7 @@ class MediaCategories:
".mp4",
".webm",
".wmv",
".ts",
}
ADOBE_PHOTOSHOP_TYPES = MediaCategory(


@@ -26,9 +26,11 @@ class UiColor(IntEnum):
THEME_DARK = 1
THEME_LIGHT = 2
RED = 3
GREEN = 4
BLUE = 5
PURPLE = 6
ORANGE = 4
AMBER = 5
GREEN = 6
BLUE = 7
PURPLE = 8
TAG_COLORS: dict[TagColorEnum, dict[ColorType, Any]] = {
@@ -54,6 +56,18 @@ UI_COLORS: dict[UiColor, dict[ColorType, Any]] = {
ColorType.LIGHT_ACCENT: "#f39caa",
ColorType.DARK_ACCENT: "#440d12",
},
UiColor.ORANGE: {
ColorType.PRIMARY: "#FF8020",
ColorType.BORDER: "#E86919",
ColorType.LIGHT_ACCENT: "#FFECB3",
ColorType.DARK_ACCENT: "#752809",
},
UiColor.AMBER: {
ColorType.PRIMARY: "#FFC107",
ColorType.BORDER: "#FFD54F",
ColorType.LIGHT_ACCENT: "#FFECB3",
ColorType.DARK_ACCENT: "#772505",
},
UiColor.GREEN: {
ColorType.PRIMARY: "#28bb48",
ColorType.BORDER: "#43c568",


@@ -3,10 +3,11 @@ from dataclasses import dataclass, field
from pathlib import Path
import structlog
from wcmatch import pathlib
from tagstudio.core.library.alchemy.library import Library
from tagstudio.core.library.alchemy.models import Entry
from tagstudio.core.utils.refresh_dir import GLOBAL_IGNORE_SET
from tagstudio.core.library.ignore import PATH_GLOB_FLAGS, Ignore
logger = structlog.get_logger()
@@ -25,7 +26,9 @@ class MissingRegistry:
def refresh_missing_files(self) -> Iterator[int]:
"""Track the number of entries that point to an invalid filepath."""
assert self.library.library_dir
logger.info("[refresh_missing_files] Refreshing missing files...")
self.missing_file_entries = []
for i, entry in enumerate(self.library.all_entries()):
full_path = self.library.library_dir / entry.path
@@ -38,16 +41,15 @@ class MissingRegistry:
Works if files were just moved to different subfolders and don't have duplicate names.
"""
matches = []
for path in self.library.library_dir.glob(f"**/{match_entry.path.name}"):
# Ensure matched file isn't in a globally ignored folder
skip: bool = False
for part in path.parts:
if part in GLOBAL_IGNORE_SET:
skip = True
break
if skip:
continue
assert self.library.library_dir
matches: list[Path] = []
ignore_patterns = Ignore.get_patterns(self.library.library_dir)
for path in pathlib.Path(str(self.library.library_dir)).glob(
f"***/{match_entry.path.name}",
flags=PATH_GLOB_FLAGS,
exclude=ignore_patterns,
):
if path.name == match_entry.path.name:
new_path = Path(path).relative_to(self.library.library_dir)
matches.append(new_path)


@@ -1,3 +1,4 @@
import shutil
from collections.abc import Iterator
from dataclasses import dataclass, field
from datetime import datetime as dt
@@ -5,27 +6,15 @@ from pathlib import Path
from time import time
import structlog
from wcmatch import pathlib
from tagstudio.core.constants import TS_FOLDER_NAME
from tagstudio.core.library.alchemy.library import Library
from tagstudio.core.library.alchemy.models import Entry
from tagstudio.core.library.ignore import PATH_GLOB_FLAGS, Ignore, ignore_to_glob
from tagstudio.qt.helpers.silent_popen import silent_run
logger = structlog.get_logger(__name__)
GLOBAL_IGNORE_SET: set[str] = set(
[
TS_FOLDER_NAME,
"$RECYCLE.BIN",
".Trashes",
".Trash",
"tagstudio_thumbs",
".fseventsd",
".Spotlight-V100",
"System Volume Information",
".DS_Store",
]
)
@dataclass
class RefreshDirTracker:
@@ -42,7 +31,7 @@ class RefreshDirTracker:
entries = [
Entry(
path=entry_path,
folder=self.library.folder,
folder=self.library.folder, # pyright: ignore[reportArgumentType]
fields=[],
date_added=dt.now(),
)
@@ -54,18 +43,81 @@ class RefreshDirTracker:
yield
def refresh_dir(self, lib_path: Path) -> Iterator[int]:
"""Scan a directory for files, and add those relative filenames to internal variables."""
def refresh_dir(self, library_dir: Path, force_internal_tools: bool = False) -> Iterator[int]:
"""Scan a directory for files, and add those relative filenames to internal variables.
Args:
library_dir (Path): The library directory.
force_internal_tools (bool): Option to force the use of internal tools for scanning
(i.e. wcmatch) instead of using tools found on the system (i.e. ripgrep).
"""
if self.library.library_dir is None:
raise ValueError("No library directory set.")
ignore_patterns = Ignore.get_patterns(library_dir)
if force_internal_tools:
return self.__wc_add(library_dir, ignore_to_glob(ignore_patterns))
dir_list: list[str] | None = self.__get_dir_list(library_dir, ignore_patterns)
# Use ripgrep if it was found and working, else fallback to wcmatch.
if dir_list is not None:
return self.__rg_add(library_dir, dir_list)
else:
return self.__wc_add(library_dir, ignore_to_glob(ignore_patterns))
def __get_dir_list(self, library_dir: Path, ignore_patterns: list[str]) -> list[str] | None:
"""Use ripgrep to return a list of matched directories and files.
Return `None` if ripgrep not found on system.
"""
rg_path = shutil.which("rg")
# Use ripgrep if found on system
if rg_path is not None:
logger.info("[Refresh: Using ripgrep for scanning]")
compiled_ignore_path = library_dir / ".TagStudio" / ".compiled_ignore"
# Write compiled ignore patterns (built-in + user) to a temp file to pass to ripgrep
with open(compiled_ignore_path, "w") as pattern_file:
pattern_file.write("\n".join(ignore_patterns))
result = silent_run(
" ".join(
[
"rg",
"--files",
"--follow",
"--hidden",
"--ignore-file",
f'"{str(compiled_ignore_path)}"',
]
),
cwd=library_dir,
capture_output=True,
text=True,
shell=True,
)
compiled_ignore_path.unlink()
if result.stderr:
logger.error(result.stderr)
return result.stdout.splitlines() # pyright: ignore [reportReturnType]
logger.warning("[Refresh: ripgrep not found on system]")
return None
def __rg_add(self, library_dir: Path, dir_list: list[str]) -> Iterator[int]:
start_time_total = time()
start_time_loop = time()
self.files_not_in_library = []
dir_file_count = 0
self.files_not_in_library = []
for r in dir_list:
f = pathlib.Path(r)
for f in lib_path.glob("**/*"):
end_time_loop = time()
# Yield output every 1/30 of a second
if (end_time_loop - start_time_loop) > 0.034:
@@ -81,31 +133,62 @@ class RefreshDirTracker:
if f.is_dir():
continue
# Ensure new file isn't in a globally ignored folder
skip: bool = False
for part in f.parts:
# NOTE: Files starting with "._" are sometimes generated by macOS Finder.
# More info: https://lists.apple.com/archives/applescript-users/2006/Jun/msg00180.html
if part.startswith("._") or part in GLOBAL_IGNORE_SET:
skip = True
break
if skip:
dir_file_count += 1
self.library.included_files.add(f)
if not self.library.has_path_entry(f):
self.files_not_in_library.append(f)
end_time_total = time()
yield dir_file_count
logger.info(
"[Refresh]: Directory scan time",
path=library_dir,
duration=(end_time_total - start_time_total),
files_scanned=dir_file_count,
tool_used="ripgrep (system)",
)
def __wc_add(self, library_dir: Path, ignore_patterns: list[str]) -> Iterator[int]:
start_time_total = time()
start_time_loop = time()
dir_file_count = 0
self.files_not_in_library = []
logger.info("[Refresh]: Falling back to wcmatch for scanning")
for f in pathlib.Path(str(library_dir)).glob(
"***/*", flags=PATH_GLOB_FLAGS, exclude=ignore_patterns
):
end_time_loop = time()
# Yield output every 1/30 of a second
if (end_time_loop - start_time_loop) > 0.034:
yield dir_file_count
start_time_loop = time()
# Skip if the file/path is already mapped in the Library
if f in self.library.included_files:
dir_file_count += 1
continue
# Ignore if the file is a directory
if f.is_dir():
continue
dir_file_count += 1
self.library.included_files.add(f)
relative_path = f.relative_to(lib_path)
# TODO - load these in batch somehow
relative_path = f.relative_to(library_dir)
if not self.library.has_path_entry(relative_path):
self.files_not_in_library.append(relative_path)
end_time_total = time()
yield dir_file_count
logger.info(
"Directory scan time",
path=lib_path,
"[Refresh]: Directory scan time",
path=library_dir,
duration=(end_time_total - start_time_total),
files_not_in_lib=self.files_not_in_library,
files_scanned=dir_file_count,
tool_used="wcmatch (internal)",
)


@@ -3,7 +3,9 @@
# Created for TagStudio: https://github.com/CyanVoxel/TagStudio
import os
from pathlib import Path
from platform import system
import structlog
from send2trash import send2trash
@@ -23,9 +25,32 @@ def delete_file(path: str | Path) -> bool:
send2trash(_path)
return True
except PermissionError as e:
logger.error(f"[delete_file][ERROR] PermissionError: {e}")
logger.error(f"[delete_file] PermissionError: {e}")
except FileNotFoundError:
logger.error(f"[delete_file][ERROR] File Not Found: {_path}")
logger.error(f"[delete_file] File Not Found: {_path}")
except OSError as e:
if system() == "Darwin" and _path.exists():
logger.info(
f'[delete_file] Encountered "{e}" on macOS and file exists; '
"Assuming it's on a network volume and proceeding to delete..."
)
return _hard_delete_file(_path)
else:
logger.error("[delete_file] OSError", error=e)
except Exception as e:
logger.error(e)
logger.error("[delete_file] Unknown Error", error_type=type(e).__name__, error=e)
return False
def _hard_delete_file(path: Path) -> bool:
"""Hard delete a file from the system. Does NOT send to system trash.
Args:
path (Path): The path of the file to delete.
"""
try:
os.remove(path)
return True
except Exception as e:
logger.error("[hard_delete_file] Error", error_type=type(e).__name__, error=e)
return False


@@ -86,7 +86,7 @@ def silent_Popen( # noqa: N802
)
def silent_run( # noqa: N802
def silent_run(
args,
bufsize=-1,
executable=None,


@@ -30,6 +30,10 @@
"broken_link_icon": {
"path": "qt/images/broken_link_icon.png",
"mode": "pil"
},
"ignored": {
"path": "qt/images/ignored_128.png",
"mode": "pil"
},
"adobe_illustrator": {
"path": "qt/images/file_icons/adobe_illustrator.png",


@@ -60,6 +60,7 @@ from tagstudio.core.library.alchemy.enums import (
from tagstudio.core.library.alchemy.fields import _FieldID
from tagstudio.core.library.alchemy.library import Library, LibraryStatus
from tagstudio.core.library.alchemy.models import Entry
from tagstudio.core.library.ignore import Ignore
from tagstudio.core.media_types import MediaCategories
from tagstudio.core.palette import ColorType, UiColor, get_ui_color
from tagstudio.core.query_lang.util import ParsingError
@@ -994,6 +995,7 @@ class QtDriver(DriverMixin, QObject):
def add_new_files_callback(self):
"""Run when user initiates adding new files to the Library."""
assert self.lib.library_dir
tracker = RefreshDirTracker(self.lib)
pw = ProgressWidget(
@@ -1003,10 +1005,9 @@ class QtDriver(DriverMixin, QObject):
)
pw.setWindowTitle(Translations["library.refresh.title"])
pw.update_label(Translations["library.refresh.scanning_preparing"])
pw.show()
iterator = FunctionIterator(lambda: tracker.refresh_dir(self.lib.library_dir))
iterator = FunctionIterator(lambda lib=self.lib.library_dir: tracker.refresh_dir(lib))
iterator.value.connect(
lambda x: (
pw.update_progress(x + 1),
@@ -1562,6 +1563,7 @@ class QtDriver(DriverMixin, QObject):
# search the library
start_time = time.time()
Ignore.get_patterns(self.lib.library_dir, include_global=True)
results = self.lib.search_library(self.browsing_history.current, self.settings.page_size)
logger.info("items to render", count=len(results))
end_time = time.time()
@@ -1706,8 +1708,9 @@ class QtDriver(DriverMixin, QObject):
)
return open_status
assert self.lib.library_dir
self.init_workers()
Ignore.get_patterns(self.lib.library_dir, include_global=True)
self.__reset_navigation()
# TODO - make this call optional


@@ -20,7 +20,9 @@ from PySide6.QtWidgets import QLabel, QVBoxLayout, QWidget
from tagstudio.core.enums import ShowFilepathOption, Theme
from tagstudio.core.library.alchemy.library import Library
from tagstudio.core.library.ignore import Ignore
from tagstudio.core.media_types import MediaCategories
from tagstudio.core.palette import ColorType, UiColor, get_ui_color
from tagstudio.qt.helpers.file_opener import FileOpenerHelper, FileOpenerLabel
from tagstudio.qt.translations import Translations
@@ -207,12 +209,30 @@ class FileAttributes(QWidget):
# Format and display any stat variables
def add_newline(stats_label_text: str) -> str:
if stats_label_text and stats_label_text[-2:] != "\n":
return stats_label_text + "\n"
if stats_label_text and stats_label_text[-4:] != "<br>":
return stats_label_text + "<br>"
return stats_label_text
if ext_display:
stats_label_text += ext_display
assert self.library.library_dir
red = get_ui_color(ColorType.PRIMARY, UiColor.RED)
orange = get_ui_color(ColorType.PRIMARY, UiColor.ORANGE)
if Ignore.compiled_patterns and not Ignore.compiled_patterns.match(
filepath.relative_to(self.library.library_dir)
):
stats_label_text = (
f"{stats_label_text}"
f" • <span style='color:{orange}'>"
f"{Translations['preview.ignored'].upper()}</span>"
)
if not filepath.exists():
stats_label_text = (
f"{stats_label_text}"
f" • <span style='color:{red}'>"
f"{Translations['preview.unlinked'].upper()}</span>"
)
if file_size:
stats_label_text += f"{file_size}"
elif file_size:


@@ -55,8 +55,9 @@ from tagstudio.core.constants import (
TS_FOLDER_NAME,
)
from tagstudio.core.exceptions import NoRendererError
from tagstudio.core.library.ignore import Ignore
from tagstudio.core.media_types import MediaCategories, MediaType
from tagstudio.core.palette import ColorType, UiColor, get_ui_color
from tagstudio.core.palette import UI_COLORS, ColorType, UiColor, get_ui_color
from tagstudio.core.utils.encoding import detect_char_encoding
from tagstudio.qt.cache_manager import CacheManager
from tagstudio.qt.helpers.blender_thumbnailer import blend_thumb
@@ -184,7 +185,14 @@ class ThumbRenderer(QObject):
return item
def _get_icon(
self, name: str, color: UiColor, size: tuple[int, int], pixel_ratio: float = 1.0
self,
name: str,
color: UiColor,
size: tuple[int, int],
pixel_ratio: float = 1.0,
bg_image: Image.Image | None = None,
draw_edge: bool = True,
is_corner: bool = False,
) -> Image.Image:
"""Return an icon given a size, pixel ratio, and radius scaling option.
@@ -193,6 +201,9 @@ class ThumbRenderer(QObject):
color (str): The color to use for the icon.
size (tuple[int,int]): The size of the icon.
pixel_ratio (float): The screen pixel ratio.
bg_image (Image.Image): Optional background image to go behind the icon.
draw_edge (bool): Flag for if the raised edge should be drawn.
is_corner (bool): Flag for if the icon should render with the "corner" style.
"""
draw_border: bool = True
if name == "thumb_loading":
@@ -200,10 +211,17 @@ class ThumbRenderer(QObject):
item: Image.Image | None = self.icons.get((name, color, *size, pixel_ratio))
if not item:
item_flat: Image.Image = self._render_icon(name, color, size, pixel_ratio, draw_border)
edge: tuple[Image.Image, Image.Image] = self._get_edge(size, pixel_ratio)
item = self._apply_edge(item_flat, edge, faded=True)
self.icons[(name, color, *size, pixel_ratio)] = item
item_flat: Image.Image = (
self._render_corner_icon(name, color, size, pixel_ratio, bg_image)
if is_corner
else self._render_center_icon(name, color, size, pixel_ratio, draw_border, bg_image)
)
if draw_edge:
edge: tuple[Image.Image, Image.Image] = self._get_edge(size, pixel_ratio)
item = self._apply_edge(item_flat, edge, faded=True)
self.icons[(name, color, *size, pixel_ratio)] = item
else:
item = item_flat
return item
def _render_mask(
@@ -289,13 +307,14 @@ class ThumbRenderer(QObject):
return (im_hl, im_sh)
def _render_icon(
def _render_center_icon(
self,
name: str,
color: UiColor,
size: tuple[int, int],
pixel_ratio: float,
draw_border: bool = True,
bg_image: Image.Image | None = None,
) -> Image.Image:
"""Render a thumbnail icon.
@@ -305,6 +324,7 @@ class ThumbRenderer(QObject):
size (tuple[int,int]): The size of the icon.
pixel_ratio (float): The screen pixel ratio.
draw_border (bool): Option to draw a border.
bg_image (Image.Image): Optional background image to go behind the icon.
"""
border_factor: int = 5
smooth_factor: int = math.ceil(2 * pixel_ratio)
@@ -315,16 +335,23 @@ class ThumbRenderer(QObject):
im: Image.Image = Image.new(
"RGBA",
size=tuple([d * smooth_factor for d in size]), # type: ignore
color="#00000000",
color="#FF000000",
)
# Create solid background color
bg: Image.Image = Image.new(
bg: Image.Image
bg = Image.new(
"RGB",
size=tuple([d * smooth_factor for d in size]), # type: ignore
color="#000000",
color="#000000FF",
)
# Use a background image if provided
if bg_image:
bg_im = Image.Image.resize(bg_image, size=tuple([d * smooth_factor for d in size])) # type: ignore
bg_im = ImageEnhance.Brightness(bg_im).enhance(0.3) # Reduce the brightness
bg.paste(bg_im)
# Paste background color with rounded rectangle mask onto blank image
im.paste(
bg,
@@ -343,7 +370,7 @@ class ThumbRenderer(QObject):
radius=math.ceil(
(radius_factor * smooth_factor * pixel_ratio) + (pixel_ratio * 1.5)
),
fill="black",
fill=None if bg_image else "black",
outline="#FF0000",
width=math.floor(
(border_factor * smooth_factor * pixel_ratio) - (pixel_ratio * 1.5)
@@ -362,7 +389,7 @@ class ThumbRenderer(QObject):
)
# Get icon by name
icon: Image.Image = self.rm.get(name)
icon: Image.Image | None = self.rm.get(name)
if not icon:
icon = self.rm.get("file_generic")
if not icon:
@@ -389,6 +416,105 @@ class ThumbRenderer(QObject):
return im
def _render_corner_icon(
self,
name: str,
color: UiColor,
size: tuple[int, int],
pixel_ratio: float,
bg_image: Image.Image | None = None,
) -> Image.Image:
"""Render a thumbnail icon with the icon in the upper-left corner.
Args:
name (str): The name of the icon resource.
color (UiColor): The color to use for the icon.
size (tuple[int,int]): The size of the icon.
pixel_ratio (float): The screen pixel ratio.
bg_image (Image.Image): Optional background image to go behind the icon.
"""
smooth_factor: int = math.ceil(2 * pixel_ratio)
icon_ratio: float = 5
padding_factor = 18
# Create larger blank image based on smooth_factor
im: Image.Image = Image.new(
"RGBA",
size=tuple([d * smooth_factor for d in size]), # type: ignore
color="#00000000",
)
bg: Image.Image
# Use a background image if provided
if bg_image:
bg = Image.Image.resize(bg_image, size=tuple([d * smooth_factor for d in size])) # type: ignore
# Create solid background color
else:
bg = Image.new(
"RGB",
size=tuple([d * smooth_factor for d in size]), # type: ignore
color="#000000",
)
# Apply color overlay
bg = self._apply_overlay_color(
im,
color,
)
# Paste background color with rounded rectangle mask onto blank image
im.paste(
bg,
(0, 0),
mask=self._get_mask(
tuple([d * smooth_factor for d in size]), # type: ignore
(pixel_ratio * smooth_factor),
),
)
colors = UI_COLORS.get(color) or UI_COLORS[UiColor.DEFAULT]
primary_color = colors.get(ColorType.PRIMARY)
# Resize image to final size
im = im.resize(
size,
resample=Image.Resampling.BILINEAR,
)
fg: Image.Image = Image.new(
"RGB",
size=size,
color=primary_color,
)
# Get icon by name
icon: Image.Image | None = self.rm.get(name)
if not icon:
icon = self.rm.get("file_generic")
if not icon:
icon = Image.new(mode="RGBA", size=(32, 32), color="magenta")
# Resize icon to fit icon_ratio
icon = icon.resize(
(
math.ceil(size[0] // icon_ratio),
math.ceil(size[1] // icon_ratio),
)
)
# Paste icon
im.paste(
im=fg.resize(
(
math.ceil(size[0] // icon_ratio),
math.ceil(size[1] // icon_ratio),
)
),
box=(size[0] // padding_factor, size[1] // padding_factor),
mask=icon.getchannel(3),
)
return im
def _apply_overlay_color(self, image: Image.Image, color: UiColor) -> Image.Image:
"""Apply a color overlay effect to an image based on its color channel data.
@@ -498,9 +624,10 @@ class ThumbRenderer(QObject):
if artwork:
image = artwork
except (
FileNotFoundError,
id3.ID3NoHeaderError,
mp4.MP4MetadataError,
mp4.MP4StreamInfoError,
id3.ID3NoHeaderError,
MutagenError,
) as e:
logger.error("Couldn't read album artwork", path=filepath, error=type(e).__name__)
@@ -644,7 +771,7 @@ class ThumbRenderer(QObject):
vtf = srctools.VTF.read(f)
im = vtf.get(frame=0).to_PIL()
except ValueError as e:
except (ValueError, FileNotFoundError) as e:
logger.error("Couldn't render thumbnail", filepath=filepath, error=type(e).__name__)
return im
@@ -879,6 +1006,7 @@ class ThumbRenderer(QObject):
im = new_bg
im = ImageOps.exif_transpose(im)
except (
FileNotFoundError,
UnidentifiedImageError,
DecompressionBombError,
NotImplementedError,
@@ -1076,6 +1204,7 @@ class ThumbRenderer(QObject):
DecompressionBombError,
UnicodeDecodeError,
OSError,
FileNotFoundError,
) as e:
logger.error("Couldn't render thumbnail", filepath=filepath, error=type(e).__name__)
return im
@@ -1165,15 +1294,48 @@ class ThumbRenderer(QObject):
)
return im
def render_unlinked(size: tuple[int, int], pixel_ratio: float) -> Image.Image:
def render_unlinked(
size: tuple[int, int], pixel_ratio: float, cached_im: Image.Image | None = None
) -> Image.Image:
im = self._get_icon(
name="broken_link_icon",
color=UiColor.RED,
size=size,
pixel_ratio=pixel_ratio,
bg_image=cached_im,
draw_edge=not cached_im,
is_corner=False,
)
return im
def render_ignored(
size: tuple[int, int], pixel_ratio: float, im: Image.Image
) -> Image.Image:
icon_ratio: float = 5
padding_factor = 18
im_ = im
icon: Image.Image = self.rm.get("ignored") # pyright: ignore[reportAssignmentType]
icon = icon.resize(
(
math.ceil(size[0] // icon_ratio),
math.ceil(size[1] // icon_ratio),
)
)
im_.paste(
im=icon.resize(
(
math.ceil(size[0] // icon_ratio),
math.ceil(size[1] // icon_ratio),
)
),
box=(size[0] // padding_factor, size[1] // padding_factor),
)
return im_
def fetch_cached_image(folder: Path):
image: Image.Image | None = None
cached_path: Path | None = None
@@ -1276,6 +1438,19 @@ class ThumbRenderer(QObject):
four_corner_gradient(image, (adj_size, adj_size), mask), edge
)
# Check if the file is supposed to be ignored and render an overlay if needed
try:
if (
image
and Ignore.compiled_patterns
and not Ignore.compiled_patterns.match(
filepath.relative_to(self.lib.library_dir)
)
):
image = render_ignored((adj_size, adj_size), pixel_ratio, image)
except TypeError:
pass
# A loading thumbnail (cached in memory)
elif is_loading:
# Initialize "Loading" thumbnail
@@ -1354,9 +1529,6 @@ class ThumbRenderer(QObject):
if _filepath:
try:
# Missing Files ================================================
if not _filepath.exists():
raise FileNotFoundError
ext: str = _filepath.suffix.lower() if _filepath.suffix else _filepath.stem.lower()
# Images =======================================================
if MediaCategories.is_ext_in_category(
@@ -1451,8 +1623,6 @@ class ThumbRenderer(QObject):
if save_to_file and savable_media_type and image:
ThumbRenderer.cache.save_image(image, save_to_file, mode="RGBA")
except FileNotFoundError:
image = None
except (
UnidentifiedImageError,
DecompressionBombError,
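
The corner placement used by `_render_corner_icon` and `render_ignored` in this file comes down to two integer divisions of the thumbnail size. The sketch below isolates that arithmetic; `corner_icon_geometry` is a hypothetical helper name that does not exist in the commit, and the defaults simply mirror the `icon_ratio = 5` and `padding_factor = 18` constants shown above.

```python
import math


def corner_icon_geometry(
    size: tuple[int, int],
    icon_ratio: float = 5,
    padding_factor: int = 18,
) -> tuple[tuple[int, int], tuple[int, int]]:
    """Return ((icon_w, icon_h), (offset_x, offset_y)) for a corner overlay.

    Hypothetical helper for illustration; it mirrors the expressions in the diff.
    """
    # Floor division runs first, so math.ceil only converts a possible
    # float result (when icon_ratio is a float) back to an int.
    icon_size = (
        math.ceil(size[0] // icon_ratio),
        math.ceil(size[1] // icon_ratio),
    )
    # The paste box is inset from the upper-left corner by size / padding_factor.
    offset = (size[0] // padding_factor, size[1] // padding_factor)
    return icon_size, offset


if __name__ == "__main__":
    print(corner_icon_geometry((160, 160)))
    print(corner_icon_geometry((256, 256), icon_ratio=5.0))
```

Because both values scale with `size`, the overlay keeps the same relative position and proportion at every thumbnail size.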

Binary file not shown.

Size: 7.1 KiB

View File

@@ -226,8 +226,10 @@
"namespace.create.title": "Create Namespace",
"namespace.new.button": "New Namespace",
"namespace.new.prompt": "Create a New Namespace to Start Adding Custom Colors!",
"preview.ignored": "Ignored",
"preview.multiple_selection": "<b>{count}</b> Items Selected",
"preview.no_selection": "No Items Selected",
"preview.unlinked": "Unlinked",
"select.add_tag_to_selected": "Add Tag to Selected",
"select.all": "Select All",
"select.clear": "Clear Selection",

View File

@@ -7,7 +7,7 @@ CWD = Path(__file__).parent
def test_refresh_dupe_files(library):
library.library_dir = "/tmp/"
library.library_dir = Path("/tmp/")
entry = Entry(
folder=library.folder,
path=Path("bar/foo.txt"),

View File

@@ -19,8 +19,6 @@ def test_refresh_new_files(library, exclude_mode):
library.included_files.clear()
(library.library_dir / "FOO.MD").touch()
# When
assert len(list(registry.refresh_dir(library.library_dir))) == 1
# Then
# Test if the single file was added
list(registry.refresh_dir(library.library_dir, force_internal_tools=True))
assert registry.files_not_in_library == [Path("FOO.MD")]
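
The commit message says the removed assertion depended on the timing of `refresh_dir()`'s yields rather than on actual values. The sketch below illustrates that distinction with a hypothetical `FakeScanner` class and `scan` method; neither is TagStudio's real registry or API, and the progress cadence is an assumption chosen only to make the effect visible.

```python
from collections.abc import Iterator
from pathlib import Path


class FakeScanner:
    """Hypothetical stand-in for a directory scanner; not TagStudio's registry."""

    def __init__(self) -> None:
        self.files_not_in_library: list[Path] = []

    def scan(self, names: list[str], report_every: int) -> Iterator[int]:
        """Record each name and yield a progress count every `report_every` items."""
        self.files_not_in_library = []
        for i, name in enumerate(names, start=1):
            self.files_not_in_library.append(Path(name))
            if i % report_every == 0:
                yield i
        yield len(names)  # final progress update


if __name__ == "__main__":
    scanner = FakeScanner()
    # The number of yields depends on the reporting cadence...
    print(len(list(scanner.scan(["FOO.MD"], report_every=1))))
    print(len(list(scanner.scan(["FOO.MD"], report_every=10))))
    # ...while the recorded result does not.
    print(scanner.files_not_in_library)
```

Draining the generator with `list(...)` and then asserting on the recorded state, as the updated test does, keeps the test independent of how often progress is reported.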

View File

@@ -1,12 +1,16 @@
import pytest
import structlog
from tagstudio.core.library.alchemy.enums import BrowsingState
from tagstudio.core.library.alchemy.library import Library
from tagstudio.core.query_lang.util import ParsingError
logger = structlog.get_logger()
def verify_count(lib: Library, query: str, count: int):
results = lib.search_library(BrowsingState.from_search_query(query), page_size=500)
logger.info("results", entry_ids=results.ids, count=results.total_count)
assert results.total_count == count
assert len(results.ids) == count