55 Commits

Author SHA1 Message Date
KnugiHK
bf230db595 Gracefully handle bytes that can't be decoded from db (#44) 2026-01-20 23:35:05 +08:00
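A minimal sketch of the decoding fallback this commit describes, assuming it relies on `sqlite3`'s `text_factory` hook (the `db.text_factory` lines later in this diff suggest so). The table `t` and its contents are made up for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t (name TEXT)")
# Store bytes that are not valid UTF-8 in a TEXT column (hypothetical data).
db.execute("INSERT INTO t VALUES (CAST(? AS TEXT))", (b"caf\xff",))

# With the default text_factory (str), reading the row fails because the
# stored bytes cannot be decoded as UTF-8.
try:
    db.execute("SELECT name FROM t").fetchone()
    strict_read_failed = False
except sqlite3.OperationalError:
    strict_read_failed = True

# Decode leniently: undecodable bytes become U+FFFD instead of raising.
db.text_factory = lambda b: b.decode(encoding="utf-8", errors="replace")
name = db.execute("SELECT name FROM t").fetchone()[0]
```

The trade-off is that damaged text is shown with replacement characters rather than aborting the whole export.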
KnugiHK
242e8ee43a Fix regressions introduced in 194ed29 (default template swap)
This commit restores the logic originally introduced in:

* 265afc1
* 8cf1071
* 177b936
2026-01-20 01:42:30 +08:00
lifnej
c32096b26b Show sql errors if DEBUG flag is set. 2026-01-20 00:07:04 +08:00
lifnej
4aa1c26232 Missing newline in vcard info log. 2026-01-20 00:06:38 +08:00
KnugiHK
feca9ae8e0 Fix error on database without jid_map table
I realized the `jid_map` table might be missing after reviewing @lifnej's work in ee7db80. This fix uses the preflight check result to confirm the table exists before querying it.

I plan to apply this same pattern to other sections where `jid_map` is used.
2026-01-19 22:59:19 +08:00
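A preflight check of this kind could look like the sketch below. It is not the project's `check_jid_map` implementation, which is not shown here; `table_exists` is a hypothetical helper that asks SQLite's `sqlite_master` catalog whether a table is present.

```python
import sqlite3


def table_exists(db: sqlite3.Connection, table: str) -> bool:
    """Return True if `table` exists in the connected database (hypothetical helper)."""
    cur = db.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    )
    return cur.fetchone() is not None


# Demo database that mimics an older schema without jid_map.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE message (id INTEGER PRIMARY KEY)")

# Run the check once, then branch on the cached result before querying.
jid_map_exists = table_exists(db, "jid_map")
```

Running the check once and storing the result avoids wrapping every later query in its own error handling.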
KnugiHK
92c325294c Add preflight check to see if the jid_map table exists 2026-01-19 22:53:29 +08:00
KnugiHK
7dbd0dbe3c Add preflight check to see if transcription column exists 2026-01-19 22:46:30 +08:00
KnugiHK
035e61c4d7 Fix incremental merge CI 2026-01-19 21:31:23 +08:00
KnugiHK
96d323e0ed Fetch sender_timestamp for future use
WhatsApp doesn't show when a reaction was made, and I don't want to mess with a popup in the HTML yet. Let’s just fetch the data for now. It might come in handy later.

Credit to @tlcameron3 from #79
2026-01-19 21:28:50 +08:00
Knugi
35ad2559d7 Merge pull request #193 from m1ndy/feature/export-reactions
feat: Add support for exporting message reactions
2026-01-19 20:53:18 +08:00
KnugiHK
8058ed8219 Add tqdm progress bar 2026-01-19 20:49:14 +08:00
KnugiHK
908d8f71ca Fix merge conflict error 2026-01-19 20:41:45 +08:00
Knugi
f2b6a39011 Merge branch 'dev' into feature/export-reactions 2026-01-19 20:38:20 +08:00
KnugiHK
4f531ec52a Reverting the __version__ handle
See my comment at https://github.com/KnugiHK/WhatsApp-Chat-Exporter/pull/193/changes
2026-01-19 20:36:18 +08:00
KnugiHK
b69f645ac3 Apply the same LID mapping to all SQL queries
Because the chat filter needs it
2026-01-19 20:29:56 +08:00
KnugiHK
f8b959e1e1 Implement an on-the-fly fix of dot-ending files (#185) 2026-01-18 23:03:49 +08:00
KnugiHK
9be210f34a Implement voice message transcription for Android (#159) 2026-01-18 21:59:03 +08:00
KnugiHK
ae7ba3da96 action_type 58 is actually shared with unblocking 2026-01-18 21:53:36 +08:00
KnugiHK
00e58ce2c9 Handle group message sender lid mapping (#188) 2026-01-18 21:25:40 +08:00
KnugiHK
4245ecc615 Update android_handler.py 2026-01-17 15:07:16 +08:00
KnugiHK
68dcc6abe0 Improve brute-force offsets with process pool
Refactored the brute-force offset search in `_decrypt_crypt14` to use `ProcessPoolExecutor` for better parallelism and performance. Improved progress reporting and clean shutdown on success or interruption.
2026-01-17 14:43:51 +08:00
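The pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the code in `_decrypt_crypt14`: `looks_valid` stands in for a real decryption attempt, and `brute_force`, its parameters, and the `(iv_start, db_start)` tuple layout are hypothetical.

```python
from concurrent.futures import Executor, ProcessPoolExecutor, as_completed
from typing import Callable, Iterable, Optional, Tuple

Offsets = Tuple[int, int]  # (iv_start, db_start); hypothetical layout


def looks_valid(offsets: Offsets, data: bytes) -> Optional[Offsets]:
    """Stand-in for a decryption attempt: succeed if a SQLite header is found."""
    _iv_start, db_start = offsets
    return offsets if data[db_start:db_start + 6] == b"SQLite" else None


def brute_force(
    data: bytes,
    candidates: Iterable[Offsets],
    attempt: Callable[[Offsets, bytes], Optional[Offsets]] = looks_valid,
    max_workers: int = 4,
    executor_cls: Callable[..., Executor] = ProcessPoolExecutor,
) -> Optional[Offsets]:
    """Try candidates in parallel and return the first one that succeeds."""
    with executor_cls(max_workers=max_workers) as pool:
        futures = [pool.submit(attempt, c, data) for c in candidates]
        try:
            for future in as_completed(futures):
                result = future.result()
                if result is not None:
                    return result
        finally:
            # Drop attempts that have not started yet (on success or interruption).
            for future in futures:
                future.cancel()
    return None
```

With a process pool, the worker function must be importable at module level, and on platforms that spawn worker processes the call should sit under an `if __name__ == "__main__":` guard.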
KnugiHK
c05e76569b Add more chat types 2026-01-17 13:55:16 +08:00
KnugiHK
a6fe0d93b1 Rename the obj variable to json_obj in telegram_json_format 2026-01-17 13:54:56 +08:00
KnugiHK
2d096eff4d Add tqdm as dependency 2026-01-17 13:45:39 +08:00
KnugiHK
ea9675973c Refactor Message class to accept pre-initialized Timing object
Pass the `Timing` object directly through `timezone_offset` to avoid repeated initialization of the same object within the `Message` class.
2026-01-17 13:42:11 +08:00
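The idea behind this refactor can be sketched as below. `OffsetClock` and `MessageSketch` are hypothetical stand-ins; the real `Timing` and `Message` classes are not shown here and may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


class OffsetClock:
    """Hypothetical stand-in for a timing helper bound to one UTC offset."""

    def __init__(self, offset_hours: float) -> None:
        self.tz = timezone(timedelta(hours=offset_hours))

    def format(self, timestamp: float, fmt: str = "%Y-%m-%d %H:%M") -> str:
        return datetime.fromtimestamp(timestamp, self.tz).strftime(fmt)


@dataclass
class MessageSketch:
    """Hypothetical message that receives a ready-made clock instead of an offset."""

    timestamp: float
    clock: OffsetClock

    def time_text(self) -> str:
        return self.clock.format(self.timestamp)


# Build the helper once and share it across all messages.
clock = OffsetClock(8)
messages = [MessageSketch(ts, clock) for ts in (0, 3600)]
```

Every message holds a reference to the same helper, so the per-message construction cost disappears.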
KnugiHK
064b923cfa Convert time unit for progress 2026-01-17 13:22:56 +08:00
KnugiHK
cd35ffc185 Remove the prompt after the user enters the password 2026-01-17 13:19:10 +08:00
KnugiHK
05bd26b8ed Decrease the default number of brute-force workers to 4 2026-01-17 13:18:49 +08:00
KnugiHK
d200130335 Refactor to use tqdm for showing progress 2026-01-17 13:18:31 +08:00
KnugiHK
1c7d6f7912 Update README.md 2026-01-14 02:10:05 +08:00
KnugiHK
94960e4a23 Add iphone_backup_decrypt as an optional dependency (#123)
to make managing dependency easier
2026-01-14 02:07:10 +08:00
KnugiHK
79578d867f Handle new LID mapping #188, #144, #168
Implements the latest LID mapping changes. This should fully address #188 and likely resolves #144 (validation required). Note: A successful fix for #144 deprecates the pending workaround in #168. Additionally, resolved a bug where chat filters were not working for newly created chat rooms.
2026-01-13 01:52:58 +08:00
KnugiHK
6910cc46a4 Update android_handler.py 2026-01-12 22:55:51 +08:00
KnugiHK
9e0457e720 Adjust the reaction to be rendered at the bottom left/right corner
This makes the reaction match WhatsApp's theme.
2026-01-12 22:54:05 +08:00
KnugiHK
e0967a3104 Defer reaction logging until table existence is confirmed
Moved the "Processing reactions..." log entry to occur after the `message_add_on` table check. This prevents the log from appearing on the old WhatsApp schema.
2026-01-12 22:23:16 +08:00
KnugiHK
db50f24dd8 Minor formats 2026-01-12 22:19:59 +08:00
Cosmo
75fcf33fda feat: Add support for exporting message reactions 2026-01-11 07:06:23 -08:00
KnugiHK
0ba81e0863 Implement granular error handling
Added and improved layered Zlib and SQLite header checks to distinguish between authentication failures (wrong key) and data corruption.
2026-01-08 23:59:31 +08:00
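The layered checks can be illustrated with the standard library alone. This sketch covers only the steps after decryption and assumes the decrypted payload is already in hand; `diagnose` and its return labels are hypothetical, not the project's API.

```python
import zlib

SQLITE_MAGIC = b"SQLite format 3\x00"


def diagnose(decrypted: bytes) -> str:
    """Classify a decrypted payload layer by layer (hypothetical helper)."""
    # Layer 1: streams produced by zlib with its usual settings start with 0x78.
    if len(decrypted) < 2 or decrypted[0] != 0x78:
        return "not_zlib"
    # Layer 2: the stream must decompress cleanly.
    try:
        db = zlib.decompress(decrypted)
    except zlib.error:
        return "corrupt_stream"
    # Layer 3: the decompressed data must carry the SQLite file header.
    if not db.startswith(SQLITE_MAGIC):
        return "not_sqlite"
    return "ok"


good = zlib.compress(SQLITE_MAGIC + b"\x00" * 100)
```

Because each layer fails with its own label, a wrong key (rejected earlier, at authentication) can be reported separately from a damaged or unexpected file.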
KnugiHK
647e406ac0 Implement early key validation via authenticated decryption (#190)
Utilize `decrypt_and_verify` to immediately identify incorrect user-provided keys via GCM tag validation.
2026-01-08 23:57:02 +08:00
KnugiHK
9cedcf1767 Create conftest to move test_nuitka_binary.py to the end of testing
Moves test_nuitka_binary.py to the end and fails if the file is missing.
2026-01-06 23:00:36 +08:00
KnugiHK
93a020f68d Merge branch 'dev' 2026-01-06 21:19:22 +08:00
KnugiHK
401abfb732 Bump version 2026-01-06 21:19:09 +08:00
KnugiHK
3538c81605 Enhance quoted message resolution to include media caption
Modified the `reply_query` to support messages that may not have body text but contain a media caption.
2026-01-06 20:59:51 +08:00
KnugiHK
5a20953a81 Optimize quoted message lookups via global in-memory mapping
This change replaces the inefficient N+1 SQL query pattern with a pre-computed hash map. By fetching `ZSTANZAID` and `ZTEXT` pairs globally before processing, the exporter can resolve quoted message content in O(1) time.

Crucially, this maintains parity with the Android exporter by ensuring that replies to messages outside the current date or chat filters are still correctly rendered, providing full conversational context without the performance penalty of repeated database hits.
2026-01-06 20:51:29 +08:00
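A rough sketch of the lookup change, assuming a table that exposes the `ZSTANZAID` and `ZTEXT` columns named above. The table name `demo_message`, the sample rows, and both helper functions are hypothetical.

```python
import sqlite3
from typing import Dict, Optional

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE demo_message (ZSTANZAID TEXT, ZTEXT TEXT)")
db.executemany(
    "INSERT INTO demo_message VALUES (?, ?)",
    [("A1", "hello"), ("B2", "see you"), ("C3", None)],
)


def build_quote_map(conn: sqlite3.Connection) -> Dict[str, Optional[str]]:
    """Fetch every (stanza id, text) pair once, ignoring date and chat filters."""
    rows = conn.execute(
        "SELECT ZSTANZAID, ZTEXT FROM demo_message WHERE ZSTANZAID IS NOT NULL"
    )
    return {stanza_id: text for stanza_id, text in rows}


quote_map = build_quote_map(db)


def resolve_quote(stanza_id: str) -> Optional[str]:
    """Average-case constant-time lookup that replaces one SQL query per reply."""
    return quote_map.get(stanza_id)
```

Building the map before any filtering is what lets a reply resolve a quoted message that falls outside the selected date range or chat.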
KnugiHK
8f29fa0505 Center the version string in the exporter banner 2026-01-06 20:35:02 +08:00
KnugiHK
0a14da9108 Reduce CI platforms 2026-01-05 00:31:47 +08:00
KnugiHK
929534ff80 Add windows 11 arm and macos 15 intel to CI 2026-01-05 00:17:00 +08:00
KnugiHK
87c1555f03 Add windows 11 arm and macos x64 to binary compiling 2026-01-05 00:02:52 +08:00
Knugi
fd325b6b59 Update generate-website.yml 2026-01-04 05:51:15 +00:00
Knugi
17e927ffd6 Update README.md 2026-01-02 05:35:59 +00:00
Knugi
5b488359c8 Update README.md 2026-01-02 05:32:39 +00:00
Knugi
d2186447c6 Update README.md 2026-01-02 05:30:22 +00:00
Knugi
82abf7d874 Add Verifying Build Integrity section 2026-01-02 04:53:52 +00:00
Knugi
5e676f2663 Merge pull request #187 from KnugiHK/alert-autofix-4
Potential fix for code scanning alert no. 4: Workflow does not contain permissions
2026-01-02 12:39:56 +08:00
Knugi
5da2772112 Potential fix for code scanning alert no. 4: Workflow does not contain permissions
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2026-01-02 04:39:37 +00:00
16 changed files with 1110 additions and 538 deletions


@@ -8,18 +8,24 @@ on:
jobs: jobs:
ci: ci:
runs-on: ${{ matrix.os }} runs-on: ${{ matrix.os }}
permissions:
contents: read
strategy: strategy:
fail-fast: false fail-fast: false
matrix: matrix:
os: [ubuntu-latest, windows-latest, macos-latest] os: [ubuntu-latest]
python-version: ["3.13", "3.14"] python-version: ["3.10", "3.11", "3.12", "3.13", "3.14"]
include: include:
- os: ubuntu-latest - os: windows-latest
python-version: "3.10" python-version: "3.13"
- os: ubuntu-latest - os: macos-latest
python-version: "3.11" python-version: "3.13"
- os: ubuntu-latest - os: windows-11-arm
python-version: "3.12" python-version: "3.13"
- os: macos-15-intel
python-version: "3.13"
- os: windows-latest
python-version: "3.14"
steps: steps:
- name: Checkout code - name: Checkout code


@@ -10,7 +10,6 @@ permissions:
id-token: write id-token: write
attestations: write attestations: write
jobs: jobs:
linux: linux:
runs-on: ubuntu-latest runs-on: ubuntu-latest
@@ -37,11 +36,10 @@ jobs:
subject-path: ./wtsexporter_linux_x64 subject-path: ./wtsexporter_linux_x64
- uses: actions/upload-artifact@v6 - uses: actions/upload-artifact@v6
with: with:
name: binary-linux name: binary-linux-x64
path: | path: ./wtsexporter_linux_x64
./wtsexporter_linux_x64
windows: windows-x64:
runs-on: windows-latest runs-on: windows-latest
steps: steps:
- uses: actions/checkout@v6 - uses: actions/checkout@v6
@@ -57,19 +55,45 @@ jobs:
- name: Build binary with Nuitka - name: Build binary with Nuitka
run: | run: |
python -m nuitka --onefile --include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html --assume-yes-for-downloads Whatsapp_Chat_Exporter --output-filename=wtsexporter python -m nuitka --onefile --include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html --assume-yes-for-downloads Whatsapp_Chat_Exporter --output-filename=wtsexporter
Rename-Item -Path "wtsexporter.exe" -NewName "wtsexporter_x64.exe" Rename-Item -Path "wtsexporter.exe" -NewName "wtsexporter_win_x64.exe"
Get-FileHash wtsexporter_x64.exe Get-FileHash wtsexporter_win_x64.exe
- name: Generate artifact attestation - name: Generate artifact attestation
uses: actions/attest-build-provenance@v3 uses: actions/attest-build-provenance@v3
with: with:
subject-path: .\wtsexporter_x64.exe subject-path: .\wtsexporter_win_x64.exe
- uses: actions/upload-artifact@v6 - uses: actions/upload-artifact@v6
with: with:
name: binary-windows name: binary-windows-x64
path: | path: .\wtsexporter_win_x64.exe
.\wtsexporter_x64.exe
macos: windows-arm:
runs-on: windows-11-arm
steps:
- uses: actions/checkout@v6
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install pycryptodome javaobj-py3 ordered-set zstandard nuitka==2.8.9
pip install .
- name: Build binary with Nuitka
run: |
python -m nuitka --onefile --include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html --assume-yes-for-downloads Whatsapp_Chat_Exporter --output-filename=wtsexporter
Rename-Item -Path "wtsexporter.exe" -NewName "wtsexporter_win_arm64.exe"
Get-FileHash wtsexporter_win_arm64.exe
- name: Generate artifact attestation
uses: actions/attest-build-provenance@v3
with:
subject-path: .\wtsexporter_win_arm64.exe
- uses: actions/upload-artifact@v6
with:
name: binary-windows-arm64
path: .\wtsexporter_win_arm64.exe
macos-arm:
runs-on: macos-latest runs-on: macos-latest
steps: steps:
- uses: actions/checkout@v6 - uses: actions/checkout@v6
@@ -86,7 +110,8 @@ jobs:
run: | run: |
python -m nuitka --onefile \ python -m nuitka --onefile \
--include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html \ --include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html \
--assume-yes-for-downloads Whatsapp_Chat_Exporter --output-filename=wtsexporter_macos_arm64 --assume-yes-for-downloads Whatsapp_Chat_Exporter --output-filename=wtsexporter
mv wtsexporter wtsexporter_macos_arm64
shasum -a 256 wtsexporter_macos_arm64 shasum -a 256 wtsexporter_macos_arm64
- name: Generate artifact attestation - name: Generate artifact attestation
uses: actions/attest-build-provenance@v3 uses: actions/attest-build-provenance@v3
@@ -94,7 +119,34 @@ jobs:
subject-path: ./wtsexporter_macos_arm64 subject-path: ./wtsexporter_macos_arm64
- uses: actions/upload-artifact@v6 - uses: actions/upload-artifact@v6
with: with:
name: binary-macos name: binary-macos-arm64
path: | path: ./wtsexporter_macos_arm64
./wtsexporter_macos_arm64
macos-intel:
runs-on: macos-15-intel
steps:
- uses: actions/checkout@v6
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install pycryptodome javaobj-py3 ordered-set zstandard nuitka==2.8.9
pip install .
- name: Build binary with Nuitka
run: |
python -m nuitka --onefile \
--include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html \
--assume-yes-for-downloads Whatsapp_Chat_Exporter --output-filename=wtsexporter
mv wtsexporter wtsexporter_macos_x64
shasum -a 256 wtsexporter_macos_x64
- name: Generate artifact attestation
uses: actions/attest-build-provenance@v3
with:
subject-path: ./wtsexporter_macos_x64
- uses: actions/upload-artifact@v6
with:
name: binary-macos-x64
path: ./wtsexporter_macos_x64


@@ -19,12 +19,12 @@ jobs:
steps: steps:
- name: Checkout repository - name: Checkout repository
uses: actions/checkout@v4 uses: actions/checkout@v6
- name: Set up Node.js - name: Set up Node.js
uses: actions/setup-node@v4 uses: actions/setup-node@v6
with: with:
node-version: '22' node-version: '24'
- name: Install dependencies - name: Install dependencies
run: npm install marked fs-extra marked-alert run: npm install marked fs-extra marked-alert
@@ -36,7 +36,7 @@ jobs:
- name: Deploy to gh-pages - name: Deploy to gh-pages
if: github.ref == 'refs/heads/main' # Ensure deployment only happens from main if: github.ref == 'refs/heads/main' # Ensure deployment only happens from main
uses: peaceiris/actions-gh-pages@v4 uses: peaceiris/actions-gh-pages@4f9cc6602d3f66b9c108549d475ec49e8ef4d45e
with: with:
github_token: ${{ secrets.GITHUB_TOKEN }} github_token: ${{ secrets.GITHUB_TOKEN }}
publish_dir: ./docs publish_dir: ./docs


@@ -115,7 +115,7 @@ Do an iPhone/iPad Backup with iTunes/Finder first.
If you want to work on an encrypted iOS/iPadOS Backup, you should install iphone_backup_decrypt from [KnugiHK/iphone_backup_decrypt](https://github.com/KnugiHK/iphone_backup_decrypt) before you run the extract_iphone_media.py. If you want to work on an encrypted iOS/iPadOS Backup, you should install iphone_backup_decrypt from [KnugiHK/iphone_backup_decrypt](https://github.com/KnugiHK/iphone_backup_decrypt) before you run the extract_iphone_media.py.
```sh ```sh
pip install git+https://github.com/KnugiHK/iphone_backup_decrypt pip install whatsapp-chat-exporter["ios_backup"]
``` ```
> [!NOTE] > [!NOTE]
> You will need to disable the built-in end-to-end encryption for WhatsApp backups. See [WhatsApp's FAQ](https://faq.whatsapp.com/490592613091019#turn-off-end-to-end-encrypted-backup) for how to do it. > You will need to disable the built-in end-to-end encryption for WhatsApp backups. See [WhatsApp's FAQ](https://faq.whatsapp.com/490592613091019#turn-off-end-to-end-encrypted-backup) for how to do it.
@@ -145,22 +145,24 @@ After extracting, you will get this:
Invoke the wtsexporter with --help option will show you all options available. Invoke the wtsexporter with --help option will show you all options available.
```sh ```sh
> wtsexporter --help > wtsexporter --help
usage: wtsexporter [-h] [-a] [-i] [-e EXPORTED] [-w WA] [-m MEDIA] [-b BACKUP] [-d DB] [-k [KEY]] usage: wtsexporter [-h] [--debug] [-a] [-i] [-e EXPORTED] [-w WA] [-m MEDIA] [-b BACKUP] [-d DB] [-k [KEY]]
[--call-db [CALL_DB_IOS]] [--wab WAB] [-o OUTPUT] [-j [JSON]] [--txt [TEXT_FORMAT]] [--no-html] [--call-db [CALL_DB_IOS]] [--wab WAB] [-o OUTPUT] [-j [JSON]] [--txt [TEXT_FORMAT]] [--no-html]
[--size [SIZE]] [--avoid-encoding-json] [--pretty-print-json [PRETTY_PRINT_JSON]] [--per-chat] [--size [SIZE]] [--no-reply] [--avoid-encoding-json] [--pretty-print-json [PRETTY_PRINT_JSON]]
[--import] [-t TEMPLATE] [--offline OFFLINE] [--no-avatar] [--experimental-new-theme] [--tg] [--per-chat] [--import] [-t TEMPLATE] [--offline OFFLINE] [--no-avatar] [--old-theme]
[--headline HEADLINE] [-c] [--create-separated-media] [--time-offset {-12 to 14}] [--date DATE] [--headline HEADLINE] [-c] [--create-separated-media] [--time-offset {-12 to 14}] [--date DATE]
[--date-format FORMAT] [--include [phone number ...]] [--exclude [phone number ...]] [--date-format FORMAT] [--include [phone number ...]] [--exclude [phone number ...]]
[--dont-filter-empty] [--enrich-from-vcards ENRICH_FROM_VCARDS] [--dont-filter-empty] [--enrich-from-vcards ENRICH_FROM_VCARDS]
[--default-country-code DEFAULT_COUNTRY_CODE] [-s] [--check-update] [--assume-first-as-me] [--default-country-code DEFAULT_COUNTRY_CODE] [--incremental-merge] [--source-dir SOURCE_DIR]
[--business] [--decrypt-chunk-size DECRYPT_CHUNK_SIZE] [--target-dir TARGET_DIR] [-s] [--check-update] [--assume-first-as-me] [--business]
[--max-bruteforce-worker MAX_BRUTEFORCE_WORKER] [--decrypt-chunk-size DECRYPT_CHUNK_SIZE] [--max-bruteforce-worker MAX_BRUTEFORCE_WORKER]
[--no-banner]
A customizable Android and iOS/iPadOS WhatsApp database parser that will give you the history of your WhatsApp A customizable Android and iOS/iPadOS WhatsApp database parser that will give you the history of your WhatsApp
conversations in HTML and JSON. Android Backup Crypt12, Crypt14 and Crypt15 supported. conversations in HTML and JSON. Android Backup Crypt12, Crypt14 and Crypt15 supported.
options: options:
-h, --help show this help message and exit -h, --help show this help message and exit
--debug Enable debug mode
Device Type: Device Type:
-a, --android Define the target as Android -a, --android Define the target as Android
@@ -188,12 +190,14 @@ Output Options:
--no-html Do not output html files --no-html Do not output html files
--size, --output-size, --split [SIZE] --size, --output-size, --split [SIZE]
Maximum (rough) size of a single output file in bytes, 0 for auto Maximum (rough) size of a single output file in bytes, 0 for auto
--no-reply Do not process replies (iOS only) (default: handle replies)
JSON Options: JSON Options:
--avoid-encoding-json --avoid-encoding-json
Don't encode non-ascii characters in the output JSON files Don't encode non-ascii characters in the output JSON files
--pretty-print-json [PRETTY_PRINT_JSON] --pretty-print-json [PRETTY_PRINT_JSON]
Pretty print the output JSON. Pretty print the output JSON.
--tg, --telegram Output the JSON in a format compatible with Telegram export (implies json-per-chat)
--per-chat Output the JSON file per chat --per-chat Output the JSON file per chat
--import Import JSON file and convert to HTML output --import Import JSON file and convert to HTML output
@@ -202,8 +206,7 @@ HTML Options:
Path to custom HTML template Path to custom HTML template
--offline OFFLINE Relative path to offline static files --offline OFFLINE Relative path to offline static files
--no-avatar Do not render avatar in HTML output --no-avatar Do not render avatar in HTML output
--experimental-new-theme --old-theme Use the old Telegram-alike theme
Use the newly designed WhatsApp-alike theme
--headline HEADLINE The custom headline for the HTML output. Use '??' as a placeholder for the chat name --headline HEADLINE The custom headline for the HTML output. Use '??' as a placeholder for the chat name
Media Handling: Media Handling:
@@ -232,12 +235,11 @@ Contact Enrichment:
will be used. 1 is for US, 66 for Thailand etc. Most likely use the number of your own country will be used. 1 is for US, 66 for Thailand etc. Most likely use the number of your own country
Incremental Merging: Incremental Merging:
--incremental-merge Performs an incremental merge of two exports. Requires setting both --source- --incremental-merge Performs an incremental merge of two exports. Requires setting both --source-dir and --target-
dir and --target-dir. The chats (JSON files only) and media from the source dir. The chats (JSON files only) and media from the source directory will be merged into the
directory will be merged into the target directory. No chat messages or media target directory. No chat messages or media will be deleted from the target directory; only
will be deleted from the target directory; only new chat messages and media new chat messages and media will be added to it. This enables chat messages and media to be
will be added to it. This enables chat messages and media to be deleted from deleted from the device to free up space, while ensuring they are preserved in the exported
the device to free up space, while ensuring they are preserved in the exported
backups. backups.
--source-dir SOURCE_DIR --source-dir SOURCE_DIR
Sets the source directory. Used for performing incremental merges. Sets the source directory. Used for performing incremental merges.
@@ -253,15 +255,37 @@ Miscellaneous:
Specify the chunk size for decrypting iOS backup, which may affect the decryption speed. Specify the chunk size for decrypting iOS backup, which may affect the decryption speed.
--max-bruteforce-worker MAX_BRUTEFORCE_WORKER --max-bruteforce-worker MAX_BRUTEFORCE_WORKER
Specify the maximum number of worker for bruteforce decryption. Specify the maximum number of worker for bruteforce decryption.
--no-banner Do not show the banner
WhatsApp Chat Exporter: 0.13.0rc1 Licensed with MIT. See https://wts.knugi.dev/docs?dest=osl for all open source WhatsApp Chat Exporter: 0.13.0rc2 Licensed with MIT. See https://wts.knugi.dev/docs?dest=osl for all open source
licenses. licenses.
``` ```
# Verifying Build Integrity
To ensure that the binaries provided in the releases were built directly from this source code via GitHub Actions and have not been tampered with, GitHub Artifact Attestations is used. You can verify the authenticity of any pre-built binaries using the GitHub CLI.
> [!NOTE]
> Requires version 0.13.0rc1 or newer. Legacy binaries are unsupported.
### Using Bash (Linux/WSL/macOS)
```bash
for file in wtsexporter*; do gh attestation verify "$file" -R KnugiHK/WhatsApp-Chat-Exporter; done
```
### Using PowerShell (Windows)
```powershell
gci "wtsexporter*" | % { gh attestation verify $_.FullName -R KnugiHK/WhatsApp-Chat-Exporter }
```
# Python Support Policy # Python Support Policy
This project officially supports all non-EOL (End-of-Life) versions of Python. Once a Python version reaches EOL, it is dropped in the next release. See [Python's EOL Schedule](https://devguide.python.org/versions/). This project officially supports all non-EOL (End-of-Life) versions of Python. Once a Python version reaches EOL, it is dropped in the next release. See [Python's EOL Schedule](https://devguide.python.org/versions/).
# Legal Stuff & Disclaimer # Legal Stuff & Disclaimer
This is a MIT licensed project. This is a MIT licensed project.


@@ -11,14 +11,16 @@ import logging
import importlib.metadata import importlib.metadata
from Whatsapp_Chat_Exporter import android_crypt, exported_handler, android_handler from Whatsapp_Chat_Exporter import android_crypt, exported_handler, android_handler
from Whatsapp_Chat_Exporter import ios_handler, ios_media_handler from Whatsapp_Chat_Exporter import ios_handler, ios_media_handler
from Whatsapp_Chat_Exporter.data_model import ChatCollection, ChatStore from Whatsapp_Chat_Exporter.data_model import ChatCollection, ChatStore, Timing
from Whatsapp_Chat_Exporter.utility import APPLE_TIME, CLEAR_LINE, Crypt, check_update from Whatsapp_Chat_Exporter.utility import APPLE_TIME, CLEAR_LINE, CURRENT_TZ_OFFSET, Crypt
from Whatsapp_Chat_Exporter.utility import readable_to_bytes, safe_name, bytes_to_readable from Whatsapp_Chat_Exporter.utility import readable_to_bytes, safe_name, bytes_to_readable
from Whatsapp_Chat_Exporter.utility import import_from_json, incremental_merge, DbType from Whatsapp_Chat_Exporter.utility import import_from_json, incremental_merge, check_update
from Whatsapp_Chat_Exporter.utility import telegram_json_format from Whatsapp_Chat_Exporter.utility import telegram_json_format, convert_time_unit, DbType
from Whatsapp_Chat_Exporter.utility import get_transcription_selection, check_jid_map
from argparse import ArgumentParser, SUPPRESS from argparse import ArgumentParser, SUPPRESS
from datetime import datetime from datetime import datetime
from getpass import getpass from getpass import getpass
from tqdm import tqdm
from sys import exit from sys import exit
from typing import Optional, List, Dict from typing import Optional, List, Dict
from Whatsapp_Chat_Exporter.vcards_contacts import ContactsFromVCards from Whatsapp_Chat_Exporter.vcards_contacts import ContactsFromVCards
@@ -42,7 +44,7 @@ WTSEXPORTER_BANNER = f"""=======================================================
╚═════╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚═╝ ╚══════╝╚═╝ ╚═╝╚═╝ ╚═════╝ ╚═╝ ╚═╝ ╚═╝ ╚══════╝╚═╝ ╚═╝ ╚═════╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚═╝ ╚══════╝╚═╝ ╚═╝╚═╝ ╚═════╝ ╚═╝ ╚═╝ ╚═╝ ╚══════╝╚═╝ ╚═╝
WhatsApp Chat Exporter: A customizable Android and iOS/iPadOS WhatsApp database parser WhatsApp Chat Exporter: A customizable Android and iOS/iPadOS WhatsApp database parser
Version: {__version__} {f"Version: {__version__}".center(104)}
========================================================================================================""" ========================================================================================================"""
@@ -286,13 +288,17 @@ def setup_argument_parser() -> ArgumentParser:
help="Specify the chunk size for decrypting iOS backup, which may affect the decryption speed." help="Specify the chunk size for decrypting iOS backup, which may affect the decryption speed."
) )
misc_group.add_argument( misc_group.add_argument(
"--max-bruteforce-worker", dest="max_bruteforce_worker", default=10, type=int, "--max-bruteforce-worker", dest="max_bruteforce_worker", default=4, type=int,
help="Specify the maximum number of worker for bruteforce decryption." help="Specify the maximum number of worker for bruteforce decryption."
) )
misc_group.add_argument( misc_group.add_argument(
"--no-banner", dest="no_banner", default=False, action='store_true', "--no-banner", dest="no_banner", default=False, action='store_true',
help="Do not show the banner" help="Do not show the banner"
) )
misc_group.add_argument(
"--fix-dot-files", dest="fix_dot_files", default=False, action='store_true',
help="Fix files with a dot at the end of their name (allowing the outputs be stored in FAT filesystems)"
)
return parser return parser
@@ -519,6 +525,7 @@ def process_contacts(args, data: ChatCollection) -> None:
if os.path.isfile(contact_db): if os.path.isfile(contact_db):
with sqlite3.connect(contact_db) as db: with sqlite3.connect(contact_db) as db:
db.row_factory = sqlite3.Row db.row_factory = sqlite3.Row
db.text_factory = lambda b: b.decode(encoding="utf-8", errors="replace")
if args.android: if args.android:
android_handler.contacts(db, data, args.enrich_from_vcards) android_handler.contacts(db, data, args.enrich_from_vcards)
else: else:
@@ -537,25 +544,29 @@ def process_messages(args, data: ChatCollection) -> None:
exit(6) exit(6)
filter_chat = (args.filter_chat_include, args.filter_chat_exclude) filter_chat = (args.filter_chat_include, args.filter_chat_exclude)
timing = Timing(args.timezone_offset if args.timezone_offset else CURRENT_TZ_OFFSET)
with sqlite3.connect(msg_db) as db: with sqlite3.connect(msg_db) as db:
db.row_factory = sqlite3.Row db.row_factory = sqlite3.Row
db.text_factory = lambda b: b.decode(encoding="utf-8", errors="replace")
# Process messages # Process messages
if args.android: if args.android:
message_handler = android_handler message_handler = android_handler
data.set_system("jid_map_exists", check_jid_map(db))
data.set_system("transcription_selection", get_transcription_selection(db))
else: else:
message_handler = ios_handler message_handler = ios_handler
message_handler.messages( message_handler.messages(
db, data, args.media, args.timezone_offset, args.filter_date, db, data, args.media, timing, args.filter_date,
filter_chat, args.filter_empty, args.no_reply_ios filter_chat, args.filter_empty, args.no_reply_ios
) )
# Process media # Process media
message_handler.media( message_handler.media(
db, data, args.media, args.filter_date, db, data, args.media, args.filter_date,
filter_chat, args.filter_empty, args.separate_media filter_chat, args.filter_empty, args.separate_media, args.fix_dot_files
) )
# Process vcards # Process vcards
@@ -565,17 +576,18 @@ def process_messages(args, data: ChatCollection) -> None:
) )
# Process calls # Process calls
process_calls(args, db, data, filter_chat) process_calls(args, db, data, filter_chat, timing)
def process_calls(args, db, data: ChatCollection, filter_chat) -> None: def process_calls(args, db, data: ChatCollection, filter_chat, timing) -> None:
"""Process call history if available.""" """Process call history if available."""
if args.android: if args.android:
android_handler.calls(db, data, args.timezone_offset, filter_chat) android_handler.calls(db, data, timing, filter_chat)
elif args.ios and args.call_db_ios is not None: elif args.ios and args.call_db_ios is not None:
with sqlite3.connect(args.call_db_ios) as cdb: with sqlite3.connect(args.call_db_ios) as cdb:
cdb.row_factory = sqlite3.Row cdb.row_factory = sqlite3.Row
ios_handler.calls(cdb, data, args.timezone_offset, filter_chat) cdb.text_factory = lambda b: b.decode(encoding="utf-8", errors="replace")
ios_handler.calls(cdb, data, timing, filter_chat)
def handle_media_directory(args) -> None: def handle_media_directory(args) -> None:
@@ -665,24 +677,27 @@ def export_multiple_json(args, data: Dict) -> None:
# Export each chat # Export each chat
total = len(data.keys()) total = len(data.keys())
for index, jik in enumerate(data.keys()): with tqdm(total=total, desc="Generating JSON files", unit="file", leave=False) as pbar:
if data[jik]["name"] is not None: for jik in data.keys():
contact = data[jik]["name"].replace('/', '') if data[jik]["name"] is not None:
else: contact = data[jik]["name"].replace('/', '')
contact = jik.replace('+', '') else:
contact = jik.replace('+', '')
if args.telegram: if args.telegram:
messages = telegram_json_format(jik, data[jik], args.timezone_offset) messages = telegram_json_format(jik, data[jik], args.timezone_offset)
else: else:
messages = {jik: data[jik]} messages = {jik: data[jik]}
with open(f"{json_path}/{safe_name(contact)}.json", "w") as f: with open(f"{json_path}/{safe_name(contact)}.json", "w") as f:
file_content = json.dumps( file_content = json.dumps(
messages, messages,
ensure_ascii=not args.avoid_encoding_json, ensure_ascii=not args.avoid_encoding_json,
indent=args.pretty_print_json indent=args.pretty_print_json
) )
f.write(file_content) f.write(file_content)
logger.info(f"Writing JSON file...({index + 1}/{total})\r") pbar.update(1)
total_time = pbar.format_dict['elapsed']
logger.info(f"Generated {total} JSON files in {convert_time_unit(total_time)}{CLEAR_LINE}")
def process_exported_chat(args, data: ChatCollection) -> None: def process_exported_chat(args, data: ChatCollection) -> None:


@@ -1,13 +1,12 @@
import time
import hmac import hmac
import io import io
import logging import logging
import threading
import zlib import zlib
import concurrent.futures import concurrent.futures
from tqdm import tqdm
from typing import Tuple, Union from typing import Tuple, Union
from hashlib import sha256 from hashlib import sha256
from sys import exit from functools import partial
from Whatsapp_Chat_Exporter.utility import CLEAR_LINE, CRYPT14_OFFSETS, Crypt, DbType from Whatsapp_Chat_Exporter.utility import CLEAR_LINE, CRYPT14_OFFSETS, Crypt, DbType
try: try:
@@ -112,13 +111,36 @@ def _decrypt_database(db_ciphertext: bytes, main_key: bytes, iv: bytes) -> bytes
zlib.error: If decompression fails. zlib.error: If decompression fails.
ValueError: if the plaintext is not a SQLite database. ValueError: if the plaintext is not a SQLite database.
""" """
FOOTER_SIZE = 32
if len(db_ciphertext) <= FOOTER_SIZE:
raise ValueError("Input data too short to contain a valid GCM tag.")
actual_ciphertext = db_ciphertext[:-FOOTER_SIZE]
tag = db_ciphertext[-FOOTER_SIZE: -FOOTER_SIZE + 16]
cipher = AES.new(main_key, AES.MODE_GCM, iv) cipher = AES.new(main_key, AES.MODE_GCM, iv)
db_compressed = cipher.decrypt(db_ciphertext) try:
db = zlib.decompress(db_compressed) db_compressed = cipher.decrypt_and_verify(actual_ciphertext, tag)
if db[0:6].upper() != b"SQLITE": except ValueError:
# The key, IV, or tag could be wrong, but most likely it is the key.
raise ValueError("Decryption/Authentication failed. Ensure you are using the correct key.")
if len(db_compressed) < 2 or db_compressed[0] != 0x78:
logger.debug(f"Data passes GCM but is not Zlib. Header: {db_compressed[:2].hex()}")
raise ValueError( raise ValueError(
"The plaintext is not a SQLite database. Ensure you are using the correct key." "Key is correct, but decrypted data is not a valid compressed stream. "
"Is this even a valid WhatsApp database backup?"
) )
try:
db = zlib.decompress(db_compressed)
except zlib.error as e:
raise zlib.error(f"Decompression failed (the backup file is likely corrupted at source): {e}")
if not db.startswith(b"SQLite"):
raise ValueError(
"Data is valid and decompressed, but it is not a SQLite database. "
"Is this even a valid WhatsApp database backup?")
return db return db
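
The new checks in this hunk run in a fixed order: split off the footer, authenticate, validate the zlib header, decompress, then check the SQLite magic. Below is a minimal standard-library sketch of the non-cryptographic steps only. It assumes the 32-byte footer layout shown in the hunk (a 16-byte tag followed by 16 bytes that are ignored here, as in the diff). `split_footer` and `check_plaintext` are hypothetical names for illustration, not functions from the exporter, and the AES-GCM step is deliberately left out.

```python
import zlib
from typing import Tuple

FOOTER_SIZE = 32  # value taken from the hunk above
TAG_SIZE = 16     # the hunk slices a 16-byte tag out of the footer


def split_footer(blob: bytes) -> Tuple[bytes, bytes]:
    """Hypothetical helper: return (ciphertext, tag) using the same slicing as the diff."""
    if len(blob) <= FOOTER_SIZE:
        raise ValueError("Input data too short to contain a valid GCM tag.")
    ciphertext = blob[:-FOOTER_SIZE]
    tag = blob[-FOOTER_SIZE:-FOOTER_SIZE + TAG_SIZE]
    return ciphertext, tag


def check_plaintext(compressed: bytes) -> bytes:
    """Hypothetical helper: validate the zlib header, decompress, check the SQLite magic."""
    # A zlib stream with the usual 32K window starts with byte 0x78.
    if len(compressed) < 2 or compressed[0] != 0x78:
        raise ValueError("Decrypted data is not a zlib stream.")
    db = zlib.decompress(compressed)  # may raise zlib.error on a corrupt stream
    if not db.startswith(b"SQLite"):
        raise ValueError("Decompressed data is not a SQLite database.")
    return db
```

Separating the checks this way is what lets each failure carry its own message: a bad tag points at the key, a bad header or magic points at the file.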
@@ -142,82 +164,69 @@ def _decrypt_crypt14(database: bytes, main_key: bytes, max_worker: int = 10) ->
# Attempt known offsets first # Attempt known offsets first
for offsets in CRYPT14_OFFSETS: for offsets in CRYPT14_OFFSETS:
iv = database[offsets["iv"]:offsets["iv"] + 16] iv = offsets["iv"]
db_ciphertext = database[offsets["db"]:] db = offsets["db"]
try: try:
decrypted_db = _decrypt_database(db_ciphertext, main_key, iv) decrypted_db = _attempt_decrypt_task((iv, iv + 16, db), database, main_key)
except (zlib.error, ValueError): except (zlib.error, ValueError):
pass # Try next offset continue
else: else:
logger.debug( logger.debug(
f"Decryption successful with known offsets: IV {offsets['iv']}, DB {offsets['db']}{CLEAR_LINE}" f"Decryption successful with known offsets: IV {iv}, DB {db}{CLEAR_LINE}"
) )
return decrypted_db # Successful decryption return decrypted_db # Successful decryption
def animate_message(stop_event): logger.info(f"Common offsets failed. Will attempt to brute-force{CLEAR_LINE}")
base_msg = "Common offsets failed. Initiating brute-force with multithreading" offset_max = 200
dots = ["", ".", "..", "..."] workers = max_worker
i = 0 check_offset = partial(_attempt_decrypt_task, database=database, main_key=main_key)
while not stop_event.is_set(): all_offsets = list(brute_force_offset(offset_max, offset_max))
logger.info(f"{base_msg}{dots[i % len(dots)]}\x1b[K\r") executor = concurrent.futures.ProcessPoolExecutor(max_workers=workers)
time.sleep(0.3) try:
i += 1 with tqdm(total=len(all_offsets), desc="Brute-forcing offsets", unit="trial", leave=False) as pbar:
logger.info(f"Common offsets failed but brute-forcing the offset works!{CLEAR_LINE}") results = executor.map(check_offset, all_offsets, chunksize=8)
found = False
stop_event = threading.Event() for offset_info, result in zip(all_offsets, results):
anim_thread = threading.Thread(target=animate_message, args=(stop_event,)) pbar.update(1)
anim_thread.start() if result:
start_iv, _, start_db = offset_info
# Convert brute force generator into a list for parallel processing # Clean shutdown on success
offset_combinations = list(brute_force_offset())
def attempt_decrypt(offset_tuple):
"""Attempt decryption with the given offsets."""
start_iv, end_iv, start_db = offset_tuple
iv = database[start_iv:end_iv]
db_ciphertext = database[start_db:]
logger.debug(""f"Trying offsets: IV {start_iv}-{end_iv}, DB {start_db}{CLEAR_LINE}")
try:
db = _decrypt_database(db_ciphertext, main_key, iv)
except (zlib.error, ValueError):
return None # Decryption failed, move to next
else:
stop_event.set()
anim_thread.join()
logger.info(
f"The offsets of your IV and database are {start_iv} and "
f"{start_db}, respectively. To include your offsets in the "
"program, please report it by creating an issue on GitHub: "
"https://github.com/KnugiHK/Whatsapp-Chat-Exporter/discussions/47"
f"\nShutting down other threads...{CLEAR_LINE}"
)
return db
with concurrent.futures.ThreadPoolExecutor(max_worker) as executor:
future_to_offset = {executor.submit(attempt_decrypt, offset)
: offset for offset in offset_combinations}
try:
for future in concurrent.futures.as_completed(future_to_offset):
result = future.result()
if result is not None:
# Shutdown remaining threads
executor.shutdown(wait=False, cancel_futures=True) executor.shutdown(wait=False, cancel_futures=True)
return result found = True
break
if found:
logger.info(
f"The offsets of your IV and database are {start_iv} and {start_db}, respectively.{CLEAR_LINE}"
)
logger.info(
f"To include your offsets in the exporter, please report it in the discussion thread on GitHub:{CLEAR_LINE}"
)
logger.info(f"https://github.com/KnugiHK/Whatsapp-Chat-Exporter/discussions/47{CLEAR_LINE}")
return result
except KeyboardInterrupt: except KeyboardInterrupt:
stop_event.set() executor.shutdown(wait=False, cancel_futures=True)
anim_thread.join() print("\n")
logger.info(f"Brute force interrupted by user (Ctrl+C). Shutting down gracefully...{CLEAR_LINE}") raise KeyboardInterrupt(
executor.shutdown(wait=False, cancel_futures=True) f"Brute force interrupted by user (Ctrl+C). Shutting down gracefully...{CLEAR_LINE}"
exit(1) )
finally:
stop_event.set() finally:
anim_thread.join() executor.shutdown(wait=False)
raise OffsetNotFoundError("Could not find the correct offsets for decryption.") raise OffsetNotFoundError("Could not find the correct offsets for decryption.")
def _attempt_decrypt_task(offset_tuple, database, main_key):
"""Attempt decryption with the given offsets."""
start_iv, end_iv, start_db = offset_tuple
iv = database[start_iv:end_iv]
db_ciphertext = database[start_db:]
try:
return _decrypt_database(db_ciphertext, main_key, iv)
except (zlib.error, ValueError):
return None
def _decrypt_crypt12(database: bytes, main_key: bytes) -> bytes: def _decrypt_crypt12(database: bytes, main_key: bytes) -> bytes:
"""Decrypt a crypt12 database. """Decrypt a crypt12 database.
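
The brute-force rewrite above binds the fixed arguments with `functools.partial` and feeds the offset list to `executor.map`, pairing results back to offsets with `zip`. The pattern can be sketched as follows. This is only an illustration under stated assumptions: `try_offset` is a hypothetical stand-in for a decryption attempt, and a `ThreadPoolExecutor` is swapped in for the diff's `ProcessPoolExecutor` so the snippet runs without a `__main__` guard.

```python
import concurrent.futures
from functools import partial


def try_offset(offset, data, needle):
    """Hypothetical stand-in for a decryption attempt: return the match or None."""
    chunk = data[offset:offset + len(needle)]
    return chunk if chunk == needle else None


def find_offset(data, needle, max_workers=4):
    """Return (offset, result) for the first offset that succeeds, else None."""
    # partial() fixes the arguments that never change between trials.
    check = partial(try_offset, data=data, needle=needle)
    offsets = list(range(len(data)))
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map() yields results in input order, so zip() pairs each with its offset.
        for offset, result in zip(offsets, executor.map(check, offsets)):
            if result is not None:
                return offset, result
    return None
```

Because `map` preserves input order, the first hit reported is the one with the lowest index in the offset list, regardless of which worker finished first.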

@@ -4,16 +4,17 @@ import logging
import sqlite3 import sqlite3
import os import os
import shutil import shutil
from tqdm import tqdm
from pathlib import Path from pathlib import Path
from mimetypes import MimeTypes from mimetypes import MimeTypes
from markupsafe import escape as htmle from markupsafe import escape as htmle
from base64 import b64decode, b64encode from base64 import b64decode, b64encode
from datetime import datetime from datetime import datetime
from Whatsapp_Chat_Exporter.data_model import ChatStore, Message from Whatsapp_Chat_Exporter.data_model import ChatStore, Message
from Whatsapp_Chat_Exporter.utility import CLEAR_LINE, CURRENT_TZ_OFFSET, MAX_SIZE, ROW_SIZE, JidType, Device from Whatsapp_Chat_Exporter.utility import CLEAR_LINE, MAX_SIZE, ROW_SIZE, JidType, Device, get_jid_map_join
from Whatsapp_Chat_Exporter.utility import rendering, get_file_name, setup_template, get_cond_for_empty from Whatsapp_Chat_Exporter.utility import rendering, get_file_name, setup_template, get_cond_for_empty
from Whatsapp_Chat_Exporter.utility import get_status_location, convert_time_unit, determine_metadata from Whatsapp_Chat_Exporter.utility import get_status_location, convert_time_unit, get_jid_map_selection
from Whatsapp_Chat_Exporter.utility import get_chat_condition, safe_name, bytes_to_readable from Whatsapp_Chat_Exporter.utility import get_chat_condition, safe_name, bytes_to_readable, determine_metadata
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -38,21 +39,24 @@ def contacts(db, data, enrich_from_vcards):
if total_row_number == 0: if total_row_number == 0:
if enrich_from_vcards is not None: if enrich_from_vcards is not None:
logger.info( logger.info(
"No contacts profiles found in the default database, contacts will be imported from the specified vCard file.") "No contacts profiles found in the default database, contacts will be imported from the specified vCard file.\n")
else: else:
logger.warning( logger.warning(
"No contacts profiles found in the default database, consider using --enrich-from-vcards for adopting names from exported contacts from Google") "No contacts profiles found in the default database, consider using --enrich-from-vcards for adopting names from exported contacts from Google\n")
return False return False
else: else:
logger.info(f"Processed {total_row_number} contacts\n") logger.info(f"Processed {total_row_number} contacts\n")
c.execute("SELECT jid, COALESCE(display_name, wa_name) as display_name, status FROM wa_contacts;") c.execute("SELECT jid, COALESCE(display_name, wa_name) as display_name, status FROM wa_contacts;")
row = c.fetchone()
while row is not None: with tqdm(total=total_row_number, desc="Processing contacts", unit="contact", leave=False) as pbar:
current_chat = data.add_chat(row["jid"], ChatStore(Device.ANDROID, row["display_name"])) while (row := _fetch_row_safely(c)) is not None:
if row["status"] is not None: current_chat = data.add_chat(row["jid"], ChatStore(Device.ANDROID, row["display_name"]))
current_chat.status = row["status"] if row["status"] is not None:
row = c.fetchone() current_chat.status = row["status"]
pbar.update(1)
total_time = pbar.format_dict['elapsed']
logger.info(f"Processed {total_row_number} contacts in {convert_time_unit(total_time)}{CLEAR_LINE}")
return True return True
@@ -71,39 +75,37 @@ def messages(db, data, media_folder, timezone_offset, filter_date, filter_chat,
filter_empty: Filter for empty chats filter_empty: Filter for empty chats
""" """
c = db.cursor() c = db.cursor()
total_row_number = _get_message_count(c, filter_empty, filter_date, filter_chat) total_row_number = _get_message_count(c, filter_empty, filter_date, filter_chat, data.get_system("jid_map_exists"))
logger.info(f"Processing messages...(0/{total_row_number})\r")
try: try:
content_cursor = _get_messages_cursor_legacy(c, filter_empty, filter_date, filter_chat) content_cursor = _get_messages_cursor_legacy(c, filter_empty, filter_date, filter_chat)
table_message = False table_message = False
except sqlite3.OperationalError: except sqlite3.OperationalError as e:
logger.debug(f'Got sql error "{e}" in _get_message_cursor_legacy trying fallback.\n')
try: try:
content_cursor = _get_messages_cursor_new(c, filter_empty, filter_date, filter_chat) content_cursor = _get_messages_cursor_new(
c,
filter_empty,
filter_date,
filter_chat,
data.get_system("transcription_selection"),
data.get_system("jid_map_exists")
)
table_message = True table_message = True
except Exception as e: except Exception as e:
raise e raise e
i = 0 with tqdm(total=total_row_number, desc="Processing messages", unit="msg", leave=False) as pbar:
# Fetch the first row safely while (content := _fetch_row_safely(content_cursor)) is not None:
content = _fetch_row_safely(content_cursor) _process_single_message(data, content, table_message, timezone_offset)
pbar.update(1)
while content is not None: total_time = pbar.format_dict['elapsed']
_process_single_message(data, content, table_message, timezone_offset) _get_reactions(db, data)
logger.info(f"Processed {total_row_number} messages in {convert_time_unit(total_time)}{CLEAR_LINE}")
i += 1
if i % 1000 == 0:
logger.info(f"Processing messages...({i}/{total_row_number})\r")
# Fetch the next row safely
content = _fetch_row_safely(content_cursor)
logger.info(f"Processed {total_row_number} messages{CLEAR_LINE}")
# Helper functions for message processing # Helper functions for message processing
def _get_message_count(cursor, filter_empty, filter_date, filter_chat): def _get_message_count(cursor, filter_empty, filter_date, filter_chat, jid_map_exists):
"""Get the total number of messages to process.""" """Get the total number of messages to process."""
try: try:
empty_filter = get_cond_for_empty(filter_empty, "messages.key_remote_jid", "messages.needs_push") empty_filter = get_cond_for_empty(filter_empty, "messages.key_remote_jid", "messages.needs_push")
@@ -124,22 +126,28 @@ def _get_message_count(cursor, filter_empty, filter_date, filter_chat):
{date_filter} {date_filter}
{include_filter} {include_filter}
{exclude_filter}""") {exclude_filter}""")
except sqlite3.OperationalError: except sqlite3.OperationalError as e:
empty_filter = get_cond_for_empty(filter_empty, "jid.raw_string", "broadcast") logger.debug(f'Got sql error "{e}" in _get_message_count trying fallback.\n')
date_filter = f'AND timestamp {filter_date}' if filter_date is not None else ''
include_filter = get_chat_condition(
filter_chat[0], True, ["jid.raw_string", "jid_group.raw_string"], "jid", "android")
exclude_filter = get_chat_condition(
filter_chat[1], False, ["jid.raw_string", "jid_group.raw_string"], "jid", "android")
cursor.execute(f"""SELECT count() empty_filter = get_cond_for_empty(filter_empty, "key_remote_jid", "broadcast")
date_filter = f'AND timestamp {filter_date}' if filter_date is not None else ''
remote_jid_selection, group_jid_selection = get_jid_map_selection(jid_map_exists)
include_filter = get_chat_condition(
filter_chat[0], True, ["key_remote_jid", "group_sender_jid"], "jid", "android")
exclude_filter = get_chat_condition(
filter_chat[1], False, ["key_remote_jid", "group_sender_jid"], "jid", "android")
cursor.execute(f"""SELECT count(),
{remote_jid_selection} as key_remote_jid,
{group_jid_selection} as group_sender_jid
FROM message FROM message
LEFT JOIN chat LEFT JOIN chat
ON chat._id = message.chat_row_id ON chat._id = message.chat_row_id
INNER JOIN jid INNER JOIN jid jid_global
ON jid._id = chat.jid_row_id ON jid_global._id = chat.jid_row_id
LEFT JOIN jid jid_group LEFT JOIN jid jid_group
ON jid_group._id = message.sender_jid_row_id ON jid_group._id = message.sender_jid_row_id
{get_jid_map_join(jid_map_exists)}
WHERE 1=1 WHERE 1=1
{empty_filter} {empty_filter}
{date_filter} {date_filter}
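
The count query above now depends on a preflight result (`jid_map_exists`) to decide which column expressions and joins to emit. The bodies of `get_jid_map_selection` and `get_jid_map_join` are not part of this diff, so the following is only a guess at the general idea, using the standard `sqlite3` module. `table_exists` and `remote_jid_selection` are hypothetical names, and the `COALESCE` expression is modelled on the media queries elsewhere in this diff.

```python
import sqlite3


def table_exists(cursor, name):
    """Preflight check: does a table with this name exist in the database?"""
    cursor.execute(
        "SELECT count(*) FROM sqlite_master WHERE type='table' AND name=?",
        (name,),
    )
    return cursor.fetchone()[0] > 0


def remote_jid_selection(jid_map_exists):
    """Hypothetical: choose the column expression based on the preflight result."""
    if jid_map_exists:
        # Prefer the mapped identifier, fall back to the raw one.
        return "COALESCE(lid_global.raw_string, jid_global.raw_string)"
    return "jid_global.raw_string"
```

Running the check once and passing the boolean down avoids relying on an `OperationalError` from a missing table to select the query shape.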
@@ -213,16 +221,24 @@ def _get_messages_cursor_legacy(cursor, filter_empty, filter_date, filter_chat):
return cursor return cursor
def _get_messages_cursor_new(cursor, filter_empty, filter_date, filter_chat): def _get_messages_cursor_new(
cursor,
filter_empty,
filter_date,
filter_chat,
transcription_selection,
jid_map_exists
):
"""Get cursor for new database schema.""" """Get cursor for new database schema."""
empty_filter = get_cond_for_empty(filter_empty, "key_remote_jid", "broadcast") empty_filter = get_cond_for_empty(filter_empty, "key_remote_jid", "broadcast")
date_filter = f'AND message.timestamp {filter_date}' if filter_date is not None else '' date_filter = f'AND message.timestamp {filter_date}' if filter_date is not None else ''
remote_jid_selection, group_jid_selection = get_jid_map_selection(jid_map_exists)
include_filter = get_chat_condition( include_filter = get_chat_condition(
filter_chat[0], True, ["key_remote_jid", "jid_group.raw_string"], "jid_global", "android") filter_chat[0], True, ["key_remote_jid", "group_sender_jid"], "jid_global", "android")
exclude_filter = get_chat_condition( exclude_filter = get_chat_condition(
filter_chat[1], False, ["key_remote_jid", "jid_group.raw_string"], "jid_global", "android") filter_chat[1], False, ["key_remote_jid", "group_sender_jid"], "jid_global", "android")
cursor.execute(f"""SELECT jid_global.raw_string as key_remote_jid, cursor.execute(f"""SELECT {remote_jid_selection} as key_remote_jid,
message._id, message._id,
message.from_me as key_from_me, message.from_me as key_from_me,
message.timestamp, message.timestamp,
@@ -237,7 +253,7 @@ def _get_messages_cursor_new(cursor, filter_empty, filter_date, filter_chat):
message.key_id, message.key_id,
message_quoted.text_data as quoted_data, message_quoted.text_data as quoted_data,
message.message_type as media_wa_type, message.message_type as media_wa_type,
jid_group.raw_string as group_sender_jid, {group_jid_selection} as group_sender_jid,
chat.subject as chat_subject, chat.subject as chat_subject,
missed_call_logs.video_call, missed_call_logs.video_call,
message.sender_jid_row_id, message.sender_jid_row_id,
@@ -247,7 +263,8 @@ def _get_messages_cursor_new(cursor, filter_empty, filter_date, filter_chat):
jid_new.raw_string as new_jid, jid_new.raw_string as new_jid,
jid_global.type as jid_type, jid_global.type as jid_type,
COALESCE(receipt_user.receipt_timestamp, message.received_timestamp) as received_timestamp, COALESCE(receipt_user.receipt_timestamp, message.received_timestamp) as received_timestamp,
COALESCE(receipt_user.read_timestamp, receipt_user.played_timestamp) as read_timestamp COALESCE(receipt_user.read_timestamp, receipt_user.played_timestamp) as read_timestamp,
{transcription_selection}
FROM message FROM message
LEFT JOIN message_quoted LEFT JOIN message_quoted
ON message_quoted.message_row_id = message._id ON message_quoted.message_row_id = message._id
@@ -279,6 +296,7 @@ def _get_messages_cursor_new(cursor, filter_empty, filter_date, filter_chat):
ON jid_new._id = message_system_number_change.new_jid_row_id ON jid_new._id = message_system_number_change.new_jid_row_id
LEFT JOIN receipt_user LEFT JOIN receipt_user
ON receipt_user.message_row_id = message._id ON receipt_user.message_row_id = message._id
{get_jid_map_join(jid_map_exists)}
WHERE key_remote_jid <> '-1' WHERE key_remote_jid <> '-1'
{empty_filter} {empty_filter}
{date_filter} {date_filter}
@@ -294,7 +312,11 @@ def _fetch_row_safely(cursor):
try: try:
content = cursor.fetchone() content = cursor.fetchone()
return content return content
except sqlite3.OperationalError: except sqlite3.OperationalError as e:
# Not sure how often this might happen, but this check should reduce the overhead
# if DEBUG flag is not set.
if logger.isEnabledFor(logging.DEBUG):
logger.debug(f'Got sql error "{e}" in _fetch_row_safely ignoring row.\n')
continue continue
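
Several loops in this diff are rewritten to the form `while (row := _fetch_row_safely(cursor)) is not None`. A minimal sketch of that loop shape is shown below, assuming the helper retries inside a `while True` loop (only the `try`/`except`/`continue` part is visible in the hunk). `fetch_row_safely` and `count_rows` are illustrative names, not the exporter's own code.

```python
import sqlite3


def fetch_row_safely(cursor):
    """Return the next row, skipping rows that raise OperationalError."""
    while True:
        try:
            return cursor.fetchone()
        except sqlite3.OperationalError:
            # Assumption: the failing row is skipped and the next one is tried.
            continue


def count_rows(cursor):
    """Consume the cursor with the walrus-style loop used in the diff."""
    seen = 0
    while (row := fetch_row_safely(cursor)) is not None:
        seen += 1
    return seen
```

The assignment expression removes the need to fetch once before the loop and again at the end of every iteration, which is where the old code was inconsistent between `_fetch_row_safely` and a bare `fetchone()`.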
@@ -320,7 +342,7 @@ def _process_single_message(data, content, table_message, timezone_offset):
timestamp=content["timestamp"], timestamp=content["timestamp"],
time=content["timestamp"], time=content["timestamp"],
key_id=content["key_id"], key_id=content["key_id"],
timezone_offset=timezone_offset if timezone_offset else CURRENT_TZ_OFFSET, timezone_offset=timezone_offset,
message_type=content["media_wa_type"], message_type=content["media_wa_type"],
received_timestamp=content["received_timestamp"], received_timestamp=content["received_timestamp"],
read_timestamp=content["read_timestamp"] read_timestamp=content["read_timestamp"]
@@ -352,9 +374,12 @@ def _process_single_message(data, content, table_message, timezone_offset):
if not table_message and content["media_caption"] is not None: if not table_message and content["media_caption"] is not None:
# Old schema # Old schema
message.caption = content["media_caption"] message.caption = content["media_caption"]
elif table_message and content["media_wa_type"] == 1 and content["data"] is not None: elif table_message:
# New schema # New schema
message.caption = content["data"] if content["media_wa_type"] == 1 and content["data"] is not None:
message.caption = content["data"]
elif content["media_wa_type"] == 2 and content["transcription_text"] is not None:
message.caption = f'"{content["transcription_text"]}"'
else: else:
message.caption = None message.caption = None
@@ -480,7 +505,79 @@ def _format_message_text(text):
return text return text
def media(db, data, media_folder, filter_date, filter_chat, filter_empty, separate_media=True): def _get_reactions(db, data):
"""
Process message reactions. Only new schema is supported.
Chat filter is not applied here at the moment. Maybe in the future.
"""
c = db.cursor()
try:
# Check that the table exists; an old schema might not have reactions, or might store them elsewhere
c.execute("SELECT count(*) FROM sqlite_master WHERE type='table' AND name='message_add_on'")
if c.fetchone()[0] == 0:
return
logger.info("Processing reactions...\r")
c.execute("""
SELECT
message_add_on.parent_message_row_id,
message_add_on_reaction.reaction,
message_add_on.from_me,
jid.raw_string as sender_jid_raw,
chat_jid.raw_string as chat_jid_raw,
message_add_on_reaction.sender_timestamp
FROM message_add_on
INNER JOIN message_add_on_reaction
ON message_add_on._id = message_add_on_reaction.message_add_on_row_id
LEFT JOIN jid
ON message_add_on.sender_jid_row_id = jid._id
LEFT JOIN chat
ON message_add_on.chat_row_id = chat._id
LEFT JOIN jid chat_jid
ON chat.jid_row_id = chat_jid._id
""")
except sqlite3.OperationalError:
logger.warning(f"Could not fetch reactions (schema might be too old or incompatible){CLEAR_LINE}")
return
rows = c.fetchall()
total_row_number = len(rows)
with tqdm(total=total_row_number, desc="Processing reactions", unit="reaction", leave=False) as pbar:
for row in rows:
parent_id = row["parent_message_row_id"]
reaction = row["reaction"]
chat_id = row["chat_jid_raw"]
_react_timestamp = row["sender_timestamp"]
if chat_id and chat_id in data:
chat = data[chat_id]
if parent_id in chat._messages:
message = chat._messages[parent_id]
# Determine sender name
sender_name = None
if row["from_me"]:
sender_name = "You"
elif row["sender_jid_raw"]:
sender_jid = row["sender_jid_raw"]
if sender_jid in data:
sender_name = data[sender_jid].name
if not sender_name:
sender_name = sender_jid.split('@')[0] if "@" in sender_jid else sender_jid
if not sender_name:
sender_name = "Unknown"
message.reactions[sender_name] = reaction
pbar.update(1)
total_time = pbar.format_dict['elapsed']
logger.info(f"Processed {total_row_number} reactions in {convert_time_unit(total_time)}{CLEAR_LINE}")
def media(db, data, media_folder, filter_date, filter_chat, filter_empty, separate_media=True, fix_dot_files=False):
""" """
Process WhatsApp media files from the database. Process WhatsApp media files from the database.
@@ -495,11 +592,10 @@ def media(db, data, media_folder, filter_date, filter_chat, filter_empty, separa
""" """
c = db.cursor() c = db.cursor()
total_row_number = _get_media_count(c, filter_empty, filter_date, filter_chat) total_row_number = _get_media_count(c, filter_empty, filter_date, filter_chat)
logger.info(f"Processing media...(0/{total_row_number})\r")
try: try:
content_cursor = _get_media_cursor_legacy(c, filter_empty, filter_date, filter_chat) content_cursor = _get_media_cursor_legacy(c, filter_empty, filter_date, filter_chat)
except sqlite3.OperationalError: except sqlite3.OperationalError as e:
logger.debug(f'Got sql error "{e}" in _get_media_cursor_legacy trying fallback.\n')
content_cursor = _get_media_cursor_new(c, filter_empty, filter_date, filter_chat) content_cursor = _get_media_cursor_new(c, filter_empty, filter_date, filter_chat)
content = content_cursor.fetchone() content = content_cursor.fetchone()
@@ -508,18 +604,12 @@ def media(db, data, media_folder, filter_date, filter_chat, filter_empty, separa
# Ensure thumbnails directory exists # Ensure thumbnails directory exists
Path(f"{media_folder}/thumbnails").mkdir(parents=True, exist_ok=True) Path(f"{media_folder}/thumbnails").mkdir(parents=True, exist_ok=True)
i = 0 with tqdm(total=total_row_number, desc="Processing media", unit="media", leave=False) as pbar:
while content is not None: while (content := _fetch_row_safely(content_cursor)) is not None:
_process_single_media(data, content, media_folder, mime, separate_media) _process_single_media(data, content, media_folder, mime, separate_media, fix_dot_files)
pbar.update(1)
i += 1 total_time = pbar.format_dict['elapsed']
if i % 100 == 0: logger.info(f"Processed {total_row_number} media in {convert_time_unit(total_time)}{CLEAR_LINE}")
logger.info(f"Processing media...({i}/{total_row_number})\r")
content = content_cursor.fetchone()
logger.info(f"Processed {total_row_number} media{CLEAR_LINE}")
# Helper functions for media processing # Helper functions for media processing
@@ -546,15 +636,18 @@ def _get_media_count(cursor, filter_empty, filter_date, filter_chat):
{date_filter} {date_filter}
{include_filter} {include_filter}
{exclude_filter}""") {exclude_filter}""")
except sqlite3.OperationalError: except sqlite3.OperationalError as e:
logger.debug(f'Got sql error "{e}" in _get_media_count trying fallback.\n')
empty_filter = get_cond_for_empty(filter_empty, "jid.raw_string", "broadcast") empty_filter = get_cond_for_empty(filter_empty, "jid.raw_string", "broadcast")
date_filter = f'AND message.timestamp {filter_date}' if filter_date is not None else '' date_filter = f'AND message.timestamp {filter_date}' if filter_date is not None else ''
include_filter = get_chat_condition( include_filter = get_chat_condition(
filter_chat[0], True, ["jid.raw_string", "jid_group.raw_string"], "jid", "android") filter_chat[0], True, ["key_remote_jid", "group_sender_jid"], "jid", "android")
exclude_filter = get_chat_condition( exclude_filter = get_chat_condition(
filter_chat[1], False, ["jid.raw_string", "jid_group.raw_string"], "jid", "android") filter_chat[1], False, ["key_remote_jid", "group_sender_jid"], "jid", "android")
cursor.execute(f"""SELECT count() cursor.execute(f"""SELECT count(),
COALESCE(lid_global.raw_string, jid.raw_string) as key_remote_jid,
COALESCE(lid_group.raw_string, jid_group.raw_string) as group_sender_jid
FROM message_media FROM message_media
INNER JOIN message INNER JOIN message
ON message_media.message_row_id = message._id ON message_media.message_row_id = message._id
@@ -564,6 +657,14 @@ def _get_media_count(cursor, filter_empty, filter_date, filter_chat):
ON jid._id = chat.jid_row_id ON jid._id = chat.jid_row_id
LEFT JOIN jid jid_group LEFT JOIN jid jid_group
ON jid_group._id = message.sender_jid_row_id ON jid_group._id = message.sender_jid_row_id
LEFT JOIN jid_map as jid_map_global
ON chat.jid_row_id = jid_map_global.lid_row_id
LEFT JOIN jid lid_global
ON jid_map_global.jid_row_id = lid_global._id
LEFT JOIN jid_map as jid_map_group
ON message.sender_jid_row_id = jid_map_group.lid_row_id
LEFT JOIN jid lid_group
ON jid_map_group.jid_row_id = lid_group._id
WHERE 1=1 WHERE 1=1
{empty_filter} {empty_filter}
{date_filter} {date_filter}
@@ -612,18 +713,19 @@ def _get_media_cursor_new(cursor, filter_empty, filter_date, filter_chat):
empty_filter = get_cond_for_empty(filter_empty, "key_remote_jid", "broadcast") empty_filter = get_cond_for_empty(filter_empty, "key_remote_jid", "broadcast")
date_filter = f'AND message.timestamp {filter_date}' if filter_date is not None else '' date_filter = f'AND message.timestamp {filter_date}' if filter_date is not None else ''
include_filter = get_chat_condition( include_filter = get_chat_condition(
filter_chat[0], True, ["key_remote_jid", "jid_group.raw_string"], "jid", "android") filter_chat[0], True, ["key_remote_jid", "group_sender_jid"], "jid", "android")
exclude_filter = get_chat_condition( exclude_filter = get_chat_condition(
filter_chat[1], False, ["key_remote_jid", "jid_group.raw_string"], "jid", "android") filter_chat[1], False, ["key_remote_jid", "group_sender_jid"], "jid", "android")
cursor.execute(f"""SELECT jid.raw_string as key_remote_jid, cursor.execute(f"""SELECT COALESCE(lid_global.raw_string, jid.raw_string) as key_remote_jid,
message_row_id, message_row_id,
file_path, file_path,
message_url, message_url,
mime_type, mime_type,
media_key, media_key,
file_hash, file_hash,
thumbnail thumbnail,
COALESCE(lid_group.raw_string, jid_group.raw_string) as group_sender_jid
FROM message_media FROM message_media
INNER JOIN message INNER JOIN message
ON message_media.message_row_id = message._id ON message_media.message_row_id = message._id
@@ -635,6 +737,14 @@ def _get_media_cursor_new(cursor, filter_empty, filter_date, filter_chat):
ON message_media.file_hash = media_hash_thumbnail.media_hash ON message_media.file_hash = media_hash_thumbnail.media_hash
LEFT JOIN jid jid_group LEFT JOIN jid jid_group
ON jid_group._id = message.sender_jid_row_id ON jid_group._id = message.sender_jid_row_id
LEFT JOIN jid_map as jid_map_global
ON chat.jid_row_id = jid_map_global.lid_row_id
LEFT JOIN jid lid_global
ON jid_map_global.jid_row_id = lid_global._id
LEFT JOIN jid_map as jid_map_group
ON message.sender_jid_row_id = jid_map_group.lid_row_id
LEFT JOIN jid lid_group
ON jid_map_group.jid_row_id = lid_group._id
WHERE jid.type <> 7 WHERE jid.type <> 7
{empty_filter} {empty_filter}
{date_filter} {date_filter}
@@ -644,7 +754,7 @@ def _get_media_cursor_new(cursor, filter_empty, filter_date, filter_chat):
return cursor return cursor
def _process_single_media(data, content, media_folder, mime, separate_media): def _process_single_media(data, content, media_folder, mime, separate_media, fix_dot_files=False):
"""Process a single media file.""" """Process a single media file."""
file_path = f"{media_folder}/{content['file_path']}" file_path = f"{media_folder}/{content['file_path']}"
current_chat = data.get_chat(content["key_remote_jid"]) current_chat = data.get_chat(content["key_remote_jid"])
@@ -652,8 +762,6 @@ def _process_single_media(data, content, media_folder, mime, separate_media):
message.media = True message.media = True
if os.path.isfile(file_path): if os.path.isfile(file_path):
message.data = file_path
# Set mime type # Set mime type
if content["mime_type"] is None: if content["mime_type"] is None:
guess = mime.guess_type(file_path)[0] guess = mime.guess_type(file_path)[0]
@@ -664,6 +772,16 @@ def _process_single_media(data, content, media_folder, mime, separate_media):
else: else:
message.mime = content["mime_type"] message.mime = content["mime_type"]
if fix_dot_files and file_path.endswith("."):
extension = mime.guess_extension(message.mime)
if message.mime == "application/octet-stream" or not extension:
new_file_path = file_path[:-1]
else:
extension = mime.guess_extension(message.mime)
new_file_path = file_path[:-1] + extension
os.rename(file_path, new_file_path)
file_path = new_file_path
# Copy media to separate folder if needed # Copy media to separate folder if needed
if separate_media: if separate_media:
chat_display_name = safe_name(current_chat.name or message.sender chat_display_name = safe_name(current_chat.name or message.sender
@@ -674,6 +792,8 @@ def _process_single_media(data, content, media_folder, mime, separate_media):
new_path = os.path.join(new_folder, current_filename) new_path = os.path.join(new_folder, current_filename)
shutil.copy2(file_path, new_path) shutil.copy2(file_path, new_path)
message.data = new_path message.data = new_path
else:
message.data = file_path
else: else:
message.data = "The media is missing" message.data = "The media is missing"
message.mime = "media" message.mime = "media"
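
The `fix_dot_files` branch above renames media whose path ends in a bare dot, appending an extension guessed from the MIME type when one is available. The path computation alone can be sketched as below. `fixed_path` is a hypothetical helper that only returns the new name and does not touch the filesystem, unlike the diff, which calls `os.rename`.

```python
from mimetypes import MimeTypes


def fixed_path(file_path, mime_type, mime=None):
    """Hypothetical helper: compute the corrected path for a file ending in '.'."""
    mime = mime or MimeTypes()
    if not file_path.endswith("."):
        return file_path  # nothing to fix
    extension = mime.guess_extension(mime_type)
    # Generic binary data or an unknown type: just drop the trailing dot.
    if mime_type == "application/octet-stream" or not extension:
        return file_path[:-1]
    # guess_extension() already includes the leading dot.
    return file_path[:-1] + extension
```

Note that a single call to `guess_extension` is enough here; the hunk calls it a second time inside the `else` branch with the same argument.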
@@ -693,49 +813,61 @@ def vcard(db, data, media_folder, filter_date, filter_chat, filter_empty):
c = db.cursor() c = db.cursor()
try: try:
rows = _execute_vcard_query_modern(c, filter_date, filter_chat, filter_empty) rows = _execute_vcard_query_modern(c, filter_date, filter_chat, filter_empty)
except sqlite3.OperationalError: except sqlite3.OperationalError as e:
logger.debug(f'Got sql error "{e}" in _execute_vcard_query_modern trying fallback.\n')
rows = _execute_vcard_query_legacy(c, filter_date, filter_chat, filter_empty) rows = _execute_vcard_query_legacy(c, filter_date, filter_chat, filter_empty)
total_row_number = len(rows) total_row_number = len(rows)
logger.info(f"Processing vCards...(0/{total_row_number})\r")
# Create vCards directory if it doesn't exist # Create vCards directory if it doesn't exist
path = os.path.join(media_folder, "vCards") path = os.path.join(media_folder, "vCards")
Path(path).mkdir(parents=True, exist_ok=True) Path(path).mkdir(parents=True, exist_ok=True)
for index, row in enumerate(rows): with tqdm(total=total_row_number, desc="Processing vCards", unit="vcard", leave=False) as pbar:
_process_vcard_row(row, path, data) for row in rows:
logger.info(f"Processing vCards...({index + 1}/{total_row_number})\r") _process_vcard_row(row, path, data)
logger.info(f"Processed {total_row_number} vCards{CLEAR_LINE}") pbar.update(1)
total_time = pbar.format_dict['elapsed']
logger.info(f"Processed {total_row_number} vCards in {convert_time_unit(total_time)}{CLEAR_LINE}")
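Reviewer note: the hunk above keeps the modern-then-legacy query fallback but now logs the swallowed `sqlite3.OperationalError` at debug level. The pattern can be sketched in isolation as below. This is a minimal illustration, not the project's code: `fetch_with_fallback`, the two SQL strings, and the single-table schema are assumptions made for the demo.

```python
import logging
import sqlite3

logger = logging.getLogger(__name__)


def fetch_with_fallback(cursor, modern_sql, legacy_sql):
    """Run modern_sql; if SQLite rejects it, log the error and run legacy_sql.

    Hypothetical helper: the diff does this inline with two query functions.
    """
    try:
        cursor.execute(modern_sql)
    except sqlite3.OperationalError as e:
        # Only visible when the logger is set to DEBUG.
        logger.debug('Got sql error "%s" in modern query, trying fallback.', e)
        cursor.execute(legacy_sql)
    return cursor.fetchall()


# Demo database that only has the "legacy" table name (assumed schema).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE messages_vcards (message_row_id INTEGER, vcard TEXT)")
db.execute("INSERT INTO messages_vcards VALUES (1, 'BEGIN:VCARD')")

rows = fetch_with_fallback(
    db.cursor(),
    "SELECT message_row_id, vcard FROM message_vcard",    # table does not exist here
    "SELECT message_row_id, vcard FROM messages_vcards",  # succeeds
)
```

Logging the exception text matters because a typo in the modern query raises the same `OperationalError` as a genuinely old schema, and the silent fallback would otherwise hide it.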
def _execute_vcard_query_modern(c, filter_date, filter_chat, filter_empty): def _execute_vcard_query_modern(c, filter_date, filter_chat, filter_empty):
"""Execute vCard query for modern WhatsApp database schema.""" """Execute vCard query for modern WhatsApp database schema."""
# Build the filter conditions # Build the filter conditions
chat_filter_include = get_chat_condition(
filter_chat[0], True, ["messages.key_remote_jid", "remote_resource"], "jid", "android")
chat_filter_exclude = get_chat_condition(
filter_chat[1], False, ["messages.key_remote_jid", "remote_resource"], "jid", "android")
date_filter = f'AND messages.timestamp {filter_date}' if filter_date is not None else '' date_filter = f'AND messages.timestamp {filter_date}' if filter_date is not None else ''
empty_filter = get_cond_for_empty(filter_empty, "key_remote_jid", "messages.needs_push") empty_filter = get_cond_for_empty(filter_empty, "key_remote_jid", "messages.needs_push")
include_filter = get_chat_condition(
filter_chat[0], True, ["key_remote_jid", "group_sender_jid"], "jid", "android")
exclude_filter = get_chat_condition(
filter_chat[1], False, ["key_remote_jid", "group_sender_jid"], "jid", "android")
query = f"""SELECT message_row_id, query = f"""SELECT message_row_id,
messages.key_remote_jid, COALESCE(lid_global.raw_string, jid.raw_string) as key_remote_jid,
vcard, vcard,
messages.media_name messages.media_name,
FROM messages_vcards COALESCE(lid_group.raw_string, jid_group.raw_string) as group_sender_jid
INNER JOIN messages FROM messages_vcards
ON messages_vcards.message_row_id = messages._id INNER JOIN messages
INNER JOIN jid ON messages_vcards.message_row_id = messages._id
ON messages.key_remote_jid = jid.raw_string INNER JOIN jid
LEFT JOIN chat ON messages.key_remote_jid = jid.raw_string
ON chat.jid_row_id = jid._id LEFT JOIN chat
ON chat.jid_row_id = jid._id
LEFT JOIN jid jid_group
ON jid_group._id = message.sender_jid_row_id
LEFT JOIN jid_map as jid_map_global
ON chat.jid_row_id = jid_map_global.lid_row_id
LEFT JOIN jid lid_global
ON jid_map_global.jid_row_id = lid_global._id
LEFT JOIN jid_map as jid_map_group
ON message.sender_jid_row_id = jid_map_group.lid_row_id
LEFT JOIN jid lid_group
ON jid_map_group.jid_row_id = lid_group._id
WHERE 1=1 WHERE 1=1
{empty_filter} {empty_filter}
{date_filter} {date_filter}
{chat_filter_include} {include_filter}
{chat_filter_exclude} {exclude_filter}
ORDER BY messages.key_remote_jid ASC;""" ORDER BY messages.key_remote_jid ASC;"""
c.execute(query) c.execute(query)
return c.fetchall() return c.fetchall()
@@ -812,32 +944,37 @@ def calls(db, data, timezone_offset, filter_chat):
chat = ChatStore(Device.ANDROID, "WhatsApp Calls") chat = ChatStore(Device.ANDROID, "WhatsApp Calls")
# Process each call # Process each call
content = calls_data.fetchone() with tqdm(total=total_row_number, desc="Processing calls", unit="call", leave=False) as pbar:
while content is not None: while (content := _fetch_row_safely(calls_data)) is not None:
_process_call_record(content, chat, data, timezone_offset) _process_call_record(content, chat, data, timezone_offset)
content = calls_data.fetchone() pbar.update(1)
total_time = pbar.format_dict['elapsed']
# Add the calls chat to the data # Add the calls chat to the data
data.add_chat("000000000000000", chat) data.add_chat("000000000000000", chat)
logger.info(f"Processed {total_row_number} calls{CLEAR_LINE}") logger.info(f"Processed {total_row_number} calls in {convert_time_unit(total_time)}{CLEAR_LINE}")
def _get_calls_count(c, filter_chat): def _get_calls_count(c, filter_chat):
"""Get the count of call records that match the filter.""" """Get the count of call records that match the filter."""
# Build the filter conditions # Build the filter conditions
chat_filter_include = get_chat_condition(filter_chat[0], True, ["jid.raw_string"]) include_filter = get_chat_condition(filter_chat[0], True, ["key_remote_jid"])
chat_filter_exclude = get_chat_condition(filter_chat[1], False, ["jid.raw_string"]) exclude_filter = get_chat_condition(filter_chat[1], False, ["key_remote_jid"])
query = f"""SELECT count() query = f"""SELECT count(),
COALESCE(lid_global.raw_string, jid.raw_string) as key_remote_jid
FROM call_log FROM call_log
INNER JOIN jid INNER JOIN jid
ON call_log.jid_row_id = jid._id ON call_log.jid_row_id = jid._id
LEFT JOIN chat LEFT JOIN chat
ON call_log.jid_row_id = chat.jid_row_id ON call_log.jid_row_id = chat.jid_row_id
LEFT JOIN jid_map as jid_map_global
ON chat.jid_row_id = jid_map_global.lid_row_id
LEFT JOIN jid lid_global
ON jid_map_global.jid_row_id = lid_global._id
WHERE 1=1 WHERE 1=1
{chat_filter_include} {include_filter}
{chat_filter_exclude}""" {exclude_filter}"""
c.execute(query) c.execute(query)
return c.fetchone()[0] return c.fetchone()[0]
@@ -846,11 +983,11 @@ def _fetch_calls_data(c, filter_chat):
"""Fetch call data from the database.""" """Fetch call data from the database."""
# Build the filter conditions # Build the filter conditions
chat_filter_include = get_chat_condition(filter_chat[0], True, ["jid.raw_string"]) include_filter = get_chat_condition(filter_chat[0], True, ["key_remote_jid"])
chat_filter_exclude = get_chat_condition(filter_chat[1], False, ["jid.raw_string"]) exclude_filter = get_chat_condition(filter_chat[1], False, ["key_remote_jid"])
query = f"""SELECT call_log._id, query = f"""SELECT call_log._id,
jid.raw_string, COALESCE(lid_global.raw_string, jid.raw_string) as key_remote_jid,
from_me, from_me,
call_id, call_id,
timestamp, timestamp,
@@ -864,9 +1001,13 @@ def _fetch_calls_data(c, filter_chat):
ON call_log.jid_row_id = jid._id ON call_log.jid_row_id = jid._id
LEFT JOIN chat LEFT JOIN chat
ON call_log.jid_row_id = chat.jid_row_id ON call_log.jid_row_id = chat.jid_row_id
LEFT JOIN jid_map as jid_map_global
ON chat.jid_row_id = jid_map_global.lid_row_id
LEFT JOIN jid lid_global
ON jid_map_global.jid_row_id = lid_global._id
WHERE 1=1 WHERE 1=1
{chat_filter_include} {include_filter}
{chat_filter_exclude}""" {exclude_filter}"""
c.execute(query) c.execute(query)
return c return c
@@ -878,13 +1019,13 @@ def _process_call_record(content, chat, data, timezone_offset):
timestamp=content["timestamp"], timestamp=content["timestamp"],
time=content["timestamp"], time=content["timestamp"],
key_id=content["call_id"], key_id=content["call_id"],
timezone_offset=timezone_offset if timezone_offset else CURRENT_TZ_OFFSET, timezone_offset=timezone_offset,
received_timestamp=None, # TODO: Add timestamp received_timestamp=None, # TODO: Add timestamp
read_timestamp=None # TODO: Add timestamp read_timestamp=None # TODO: Add timestamp
) )
# Get caller/callee name # Get caller/callee name
_jid = content["raw_string"] _jid = content["key_remote_jid"]
name = data.get_chat(_jid).name if _jid in data else content["chat_subject"] or None name = data.get_chat(_jid).name if _jid in data else content["chat_subject"] or None
if _jid is not None and "@" in _jid: if _jid is not None and "@" in _jid:
fallback = _jid.split('@')[0] fallback = _jid.split('@')[0]
@@ -929,6 +1070,7 @@ def _construct_call_description(content, call):
return description return description
# TODO: Marked for enhancement on multi-threaded processing
def create_html( def create_html(
data, data,
output_folder, output_folder,
@@ -944,7 +1086,6 @@ def create_html(
template = setup_template(template, no_avatar, experimental) template = setup_template(template, no_avatar, experimental)
total_row_number = len(data) total_row_number = len(data)
logger.info(f"Generating chats...(0/{total_row_number})\r")
# Create output directory if it doesn't exist # Create output directory if it doesn't exist
if not os.path.isdir(output_folder): if not os.path.isdir(output_folder):
@@ -952,43 +1093,42 @@ def create_html(
w3css = get_status_location(output_folder, offline_static) w3css = get_status_location(output_folder, offline_static)
for current, contact in enumerate(data): with tqdm(total=total_row_number, desc="Generating HTML", unit="file", leave=False) as pbar:
current_chat = data.get_chat(contact) for contact in data:
if len(current_chat) == 0: current_chat = data.get_chat(contact)
# Skip empty chats if len(current_chat) == 0:
continue # Skip empty chats
continue
safe_file_name, name = get_file_name(contact, current_chat) safe_file_name, name = get_file_name(contact, current_chat)
if maximum_size is not None: if maximum_size is not None:
_generate_paginated_chat( _generate_paginated_chat(
current_chat, current_chat,
safe_file_name, safe_file_name,
name, name,
contact, contact,
output_folder, output_folder,
template, template,
w3css, w3css,
maximum_size, maximum_size,
headline headline
) )
else: else:
_generate_single_chat( _generate_single_chat(
current_chat, current_chat,
safe_file_name, safe_file_name,
name, name,
contact, contact,
output_folder, output_folder,
template, template,
w3css, w3css,
headline headline
) )
if current % 10 == 0:
logger.info(f"Generating chats...({current}/{total_row_number})\r")
logger.info(f"Generated {total_row_number} chats{CLEAR_LINE}")
pbar.update(1)
total_time = pbar.format_dict['elapsed']
logger.info(f"Generated {total_row_number} chats in {convert_time_unit(total_time)}{CLEAR_LINE}")
def _generate_single_chat(current_chat, safe_file_name, name, contact, output_folder, template, w3css, headline): def _generate_single_chat(current_chat, safe_file_name, name, contact, output_folder, template, w3css, headline):
"""Generate a single HTML file for a chat.""" """Generate a single HTML file for a chat."""


@@ -66,6 +66,7 @@ class ChatCollection(MutableMapping):
def __init__(self) -> None: def __init__(self) -> None:
"""Initialize an empty chat collection.""" """Initialize an empty chat collection."""
self._chats: Dict[str, ChatStore] = {} self._chats: Dict[str, ChatStore] = {}
self._system: Dict[str, Any] = {}
def __getitem__(self, key: str) -> 'ChatStore': def __getitem__(self, key: str) -> 'ChatStore':
"""Get a chat by its ID. Required for dict-like access.""" """Get a chat by its ID. Required for dict-like access."""
@@ -148,6 +149,28 @@ class ChatCollection(MutableMapping):
""" """
return {chat_id: chat.to_json() for chat_id, chat in self._chats.items()} return {chat_id: chat.to_json() for chat_id, chat in self._chats.items()}
def get_system(self, key: str) -> Any:
"""
Get a system value by its key.
Args:
key (str): The key of the system value to retrieve
Returns:
Any: The system value if found, None otherwise
"""
return self._system.get(key)
def set_system(self, key: str, value: Any) -> None:
"""
Set a system value by its key.
Args:
key (str): The key of the system value to set
value (Any): The value to set
"""
self._system[key] = value
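Reviewer note: `get_system` and `set_system` give the collection a side channel for values that are not chats. A rough usage sketch follows, using a hypothetical stand-in class (`MiniCollection`) rather than the real `ChatCollection`, and an illustrative key name that does not come from the diff.

```python
from typing import Any, Dict


class MiniCollection:
    """Hypothetical stand-in for ChatCollection, reduced to the system-value store."""

    def __init__(self) -> None:
        self._chats: Dict[str, Any] = {}
        self._system: Dict[str, Any] = {}

    def get_system(self, key: str) -> Any:
        # dict.get returns None for unknown keys instead of raising KeyError.
        return self._system.get(key)

    def set_system(self, key: str, value: Any) -> None:
        self._system[key] = value


data = MiniCollection()
data.set_system("jid_map_exists", True)  # key name is illustrative only
found = data.get_system("jid_map_exists")
missing = data.get_system("not_set")
```

Keeping these values in a separate dict means they never show up when iterating over chats or in the JSON produced from `_chats`.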
class ChatStore: class ChatStore:
""" """
@@ -279,7 +302,7 @@ class Message:
key_id: Union[int, str], key_id: Union[int, str],
received_timestamp: int = None, received_timestamp: int = None,
read_timestamp: int = None, read_timestamp: int = None,
timezone_offset: int = 0, timezone_offset: Optional[Timing] = Timing(0),
message_type: Optional[int] = None message_type: Optional[int] = None
) -> None: ) -> None:
""" """
@@ -300,10 +323,9 @@ class Message:
""" """
self.from_me = bool(from_me) self.from_me = bool(from_me)
self.timestamp = timestamp / 1000 if timestamp > 9999999999 else timestamp self.timestamp = timestamp / 1000 if timestamp > 9999999999 else timestamp
timing = Timing(timezone_offset)
if isinstance(time, (int, float)): if isinstance(time, (int, float)):
self.time = timing.format_timestamp(self.timestamp, "%H:%M") self.time = timezone_offset.format_timestamp(self.timestamp, "%H:%M")
elif isinstance(time, str): elif isinstance(time, str):
self.time = time self.time = time
else: else:
@@ -318,14 +340,14 @@ class Message:
self.mime = None self.mime = None
self.message_type = message_type self.message_type = message_type
if isinstance(received_timestamp, (int, float)): if isinstance(received_timestamp, (int, float)):
self.received_timestamp = timing.format_timestamp( self.received_timestamp = timezone_offset.format_timestamp(
received_timestamp, "%Y/%m/%d %H:%M") received_timestamp, "%Y/%m/%d %H:%M")
elif isinstance(received_timestamp, str): elif isinstance(received_timestamp, str):
self.received_timestamp = received_timestamp self.received_timestamp = received_timestamp
else: else:
self.received_timestamp = None self.received_timestamp = None
if isinstance(read_timestamp, (int, float)): if isinstance(read_timestamp, (int, float)):
self.read_timestamp = timing.format_timestamp( self.read_timestamp = timezone_offset.format_timestamp(
read_timestamp, "%Y/%m/%d %H:%M") read_timestamp, "%Y/%m/%d %H:%M")
elif isinstance(read_timestamp, str): elif isinstance(read_timestamp, str):
self.read_timestamp = read_timestamp self.read_timestamp = read_timestamp
@@ -338,6 +360,7 @@ class Message:
self.caption = None self.caption = None
self.thumb = None # Android specific self.thumb = None # Android specific
self.sticker = False self.sticker = False
self.reactions = {}
def to_json(self) -> Dict[str, Any]: def to_json(self) -> Dict[str, Any]:
"""Convert message to JSON-serializable dict.""" """Convert message to JSON-serializable dict."""


@@ -4,8 +4,9 @@ import os
import logging import logging
from datetime import datetime from datetime import datetime
from mimetypes import MimeTypes from mimetypes import MimeTypes
from tqdm import tqdm
from Whatsapp_Chat_Exporter.data_model import ChatStore, Message from Whatsapp_Chat_Exporter.data_model import ChatStore, Message
from Whatsapp_Chat_Exporter.utility import CLEAR_LINE, Device from Whatsapp_Chat_Exporter.utility import CLEAR_LINE, Device, convert_time_unit
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -34,17 +35,16 @@ def messages(path, data, assume_first_as_me=False):
# Second pass: process the messages # Second pass: process the messages
with open(path, "r", encoding="utf8") as file: with open(path, "r", encoding="utf8") as file:
for index, line in enumerate(file): with tqdm(total=total_row_number, desc="Processing messages & media", unit="msg&media", leave=False) as pbar:
you, user_identification_done = process_line( for index, line in enumerate(file):
line, index, chat, path, you, you, user_identification_done = process_line(
assume_first_as_me, user_identification_done line, index, chat, path, you,
) assume_first_as_me, user_identification_done
)
pbar.update(1)
total_time = pbar.format_dict['elapsed']
logger.info(f"Processed {total_row_number} messages & media in {convert_time_unit(total_time)}{CLEAR_LINE}")
# Show progress
if index % 1000 == 0:
logger.info(f"Processing messages & media...({index}/{total_row_number})\r")
logger.info(f"Processed {total_row_number} messages & media{CLEAR_LINE}")
return data return data


@@ -4,12 +4,13 @@ import os
import logging import logging
import shutil import shutil
from glob import glob from glob import glob
from tqdm import tqdm
from pathlib import Path from pathlib import Path
from mimetypes import MimeTypes from mimetypes import MimeTypes
from markupsafe import escape as htmle from markupsafe import escape as htmle
from Whatsapp_Chat_Exporter.data_model import ChatStore, Message from Whatsapp_Chat_Exporter.data_model import ChatStore, Message
from Whatsapp_Chat_Exporter.utility import APPLE_TIME, CLEAR_LINE, CURRENT_TZ_OFFSET, get_chat_condition from Whatsapp_Chat_Exporter.utility import APPLE_TIME, CLEAR_LINE, get_chat_condition, Device
from Whatsapp_Chat_Exporter.utility import bytes_to_readable, convert_time_unit, safe_name, Device from Whatsapp_Chat_Exporter.utility import bytes_to_readable, convert_time_unit, safe_name
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -23,17 +24,18 @@ def contacts(db, data):
logger.info(f"Pre-processing contacts...({total_row_number})\r") logger.info(f"Pre-processing contacts...({total_row_number})\r")
c.execute("""SELECT ZWHATSAPPID, ZABOUTTEXT FROM ZWAADDRESSBOOKCONTACT WHERE ZABOUTTEXT IS NOT NULL""") c.execute("""SELECT ZWHATSAPPID, ZABOUTTEXT FROM ZWAADDRESSBOOKCONTACT WHERE ZABOUTTEXT IS NOT NULL""")
content = c.fetchone() with tqdm(total=total_row_number, desc="Processing contacts", unit="contact", leave=False) as pbar:
while content is not None: while (content := c.fetchone()) is not None:
zwhatsapp_id = content["ZWHATSAPPID"] zwhatsapp_id = content["ZWHATSAPPID"]
if not zwhatsapp_id.endswith("@s.whatsapp.net"): if not zwhatsapp_id.endswith("@s.whatsapp.net"):
zwhatsapp_id += "@s.whatsapp.net" zwhatsapp_id += "@s.whatsapp.net"
current_chat = ChatStore(Device.IOS) current_chat = ChatStore(Device.IOS)
current_chat.status = content["ZABOUTTEXT"] current_chat.status = content["ZABOUTTEXT"]
data.add_chat(zwhatsapp_id, current_chat) data.add_chat(zwhatsapp_id, current_chat)
content = c.fetchone() pbar.update(1)
logger.info(f"Pre-processed {total_row_number} contacts{CLEAR_LINE}") total_time = pbar.format_dict['elapsed']
logger.info(f"Pre-processed {total_row_number} contacts in {convert_time_unit(total_time)}{CLEAR_LINE}")
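Reviewer note: several loops in this diff move from a manual `content = c.fetchone()` at the top and bottom of the loop to an assignment expression in the loop condition. A small sketch of the new form, without the progress bar, is shown below. It requires Python 3.8 or later, and the sample rows are made up.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row  # allows content["COLUMN"] access as in the diff
db.execute("CREATE TABLE ZWAADDRESSBOOKCONTACT (ZWHATSAPPID TEXT, ZABOUTTEXT TEXT)")
db.executemany(
    "INSERT INTO ZWAADDRESSBOOKCONTACT VALUES (?, ?)",
    [
        ("15550001", "Busy"),
        ("15550002@s.whatsapp.net", "Available"),
        ("15550003", None),  # filtered out by the WHERE clause
    ],
)

c = db.cursor()
c.execute(
    "SELECT ZWHATSAPPID, ZABOUTTEXT FROM ZWAADDRESSBOOKCONTACT "
    "WHERE ZABOUTTEXT IS NOT NULL ORDER BY ZWHATSAPPID"
)

statuses = {}
# The fetch happens in the condition, so there is no trailing fetchone() to forget.
while (content := c.fetchone()) is not None:
    whatsapp_id = content["ZWHATSAPPID"]
    if not whatsapp_id.endswith("@s.whatsapp.net"):
        whatsapp_id += "@s.whatsapp.net"
    statuses[whatsapp_id] = content["ZABOUTTEXT"]
```

A side benefit is that a `continue` inside the body can no longer skip the next fetch and spin forever, which the old form had to guard against by fetching before each `continue`.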
def process_contact_avatars(current_chat, media_folder, contact_id): def process_contact_avatars(current_chat, media_folder, contact_id):
@@ -92,7 +94,6 @@ def messages(db, data, media_folder, timezone_offset, filter_date, filter_chat,
""" """
c.execute(contact_query) c.execute(contact_query)
total_row_number = c.fetchone()[0] total_row_number = c.fetchone()[0]
logger.info(f"Processing contacts...({total_row_number})\r")
# Get distinct contacts # Get distinct contacts
contacts_query = f""" contacts_query = f"""
@@ -114,24 +115,24 @@ def messages(db, data, media_folder, timezone_offset, filter_date, filter_chat,
c.execute(contacts_query) c.execute(contacts_query)
# Process each contact # Process each contact
content = c.fetchone() with tqdm(total=total_row_number, desc="Processing contacts", unit="contact", leave=False) as pbar:
while content is not None: while (content := c.fetchone()) is not None:
contact_name = get_contact_name(content) contact_name = get_contact_name(content)
contact_id = content["ZCONTACTJID"] contact_id = content["ZCONTACTJID"]
# Add or update chat # Add or update chat
if contact_id not in data: if contact_id not in data:
current_chat = data.add_chat(contact_id, ChatStore(Device.IOS, contact_name, media_folder)) current_chat = data.add_chat(contact_id, ChatStore(Device.IOS, contact_name, media_folder))
else: else:
current_chat = data.get_chat(contact_id) current_chat = data.get_chat(contact_id)
current_chat.name = contact_name current_chat.name = contact_name
current_chat.my_avatar = os.path.join(media_folder, "Media/Profile/Photo.jpg") current_chat.my_avatar = os.path.join(media_folder, "Media/Profile/Photo.jpg")
# Process avatar images # Process avatar images
process_contact_avatars(current_chat, media_folder, contact_id) process_contact_avatars(current_chat, media_folder, contact_id)
content = c.fetchone() pbar.update(1)
total_time = pbar.format_dict['elapsed']
logger.info(f"Processed {total_row_number} contacts{CLEAR_LINE}") logger.info(f"Processed {total_row_number} contacts in {convert_time_unit(total_time)}{CLEAR_LINE}")
# Get message count # Get message count
message_count_query = f""" message_count_query = f"""
@@ -178,50 +179,57 @@ def messages(db, data, media_folder, timezone_offset, filter_date, filter_chat,
""" """
c.execute(messages_query) c.execute(messages_query)
reply_query = """SELECT ZSTANZAID,
ZTEXT,
ZTITLE
FROM ZWAMESSAGE
LEFT JOIN ZWAMEDIAITEM
ON ZWAMESSAGE.Z_PK = ZWAMEDIAITEM.ZMESSAGE
WHERE ZTEXT IS NOT NULL
OR ZTITLE IS NOT NULL;"""
cursor2.execute(reply_query)
message_map = {row[0][:17]: row[1] or row[2] for row in cursor2.fetchall() if row[0]}
# Process each message # Process each message
i = 0 with tqdm(total=total_row_number, desc="Processing messages", unit="msg", leave=False) as pbar:
content = c.fetchone() while (content := c.fetchone()) is not None:
while content is not None: contact_id = content["ZCONTACTJID"]
contact_id = content["ZCONTACTJID"] message_pk = content["Z_PK"]
message_pk = content["Z_PK"] is_group_message = content["ZGROUPINFO"] is not None
is_group_message = content["ZGROUPINFO"] is not None
# Ensure chat exists # Ensure chat exists
if contact_id not in data: if contact_id not in data:
current_chat = data.add_chat(contact_id, ChatStore(Device.IOS)) current_chat = data.add_chat(contact_id, ChatStore(Device.IOS))
process_contact_avatars(current_chat, media_folder, contact_id) process_contact_avatars(current_chat, media_folder, contact_id)
else: else:
current_chat = data.get_chat(contact_id) current_chat = data.get_chat(contact_id)
# Create message object # Create message object
ts = APPLE_TIME + content["ZMESSAGEDATE"] ts = APPLE_TIME + content["ZMESSAGEDATE"]
message = Message( message = Message(
from_me=content["ZISFROMME"], from_me=content["ZISFROMME"],
timestamp=ts, timestamp=ts,
time=ts, time=ts,
key_id=content["ZSTANZAID"][:17], key_id=content["ZSTANZAID"][:17],
timezone_offset=timezone_offset if timezone_offset else CURRENT_TZ_OFFSET, timezone_offset=timezone_offset,
message_type=content["ZMESSAGETYPE"], message_type=content["ZMESSAGETYPE"],
received_timestamp=APPLE_TIME + content["ZSENTDATE"] if content["ZSENTDATE"] else None, received_timestamp=APPLE_TIME + content["ZSENTDATE"] if content["ZSENTDATE"] else None,
read_timestamp=None # TODO: Add timestamp read_timestamp=None # TODO: Add timestamp
) )
# Process message data # Process message data
invalid = process_message_data(message, content, is_group_message, data, cursor2, no_reply) invalid = process_message_data(message, content, is_group_message, data, message_map, no_reply)
# Add valid messages to chat # Add valid messages to chat
if not invalid: if not invalid:
current_chat.add_message(message_pk, message) current_chat.add_message(message_pk, message)
# Update progress pbar.update(1)
i += 1 total_time = pbar.format_dict['elapsed']
if i % 1000 == 0: logger.info(f"Processed {total_row_number} messages in {convert_time_unit(total_time)}{CLEAR_LINE}")
logger.info(f"Processing messages...({i}/{total_row_number})\r")
content = c.fetchone()
logger.info(f"Processed {total_row_number} messages{CLEAR_LINE}")
def process_message_data(message, content, is_group_message, data, cursor2, no_reply): def process_message_data(message, content, is_group_message, data, message_map, no_reply):
"""Process and set message data from content row.""" """Process and set message data from content row."""
# Handle group sender info # Handle group sender info
if is_group_message and content["ZISFROMME"] == 0: if is_group_message and content["ZISFROMME"] == 0:
@@ -247,14 +255,7 @@ def process_message_data(message, content, is_group_message, data, cursor2, no_r
if content["ZMETADATA"] is not None and content["ZMETADATA"].startswith(b"\x2a\x14") and not no_reply: if content["ZMETADATA"] is not None and content["ZMETADATA"].startswith(b"\x2a\x14") and not no_reply:
quoted = content["ZMETADATA"][2:19] quoted = content["ZMETADATA"][2:19]
message.reply = quoted.decode() message.reply = quoted.decode()
cursor2.execute(f"""SELECT ZTEXT message.quoted_data = message_map.get(message.reply)
FROM ZWAMESSAGE
WHERE ZSTANZAID LIKE '{message.reply}%'""")
quoted_content = cursor2.fetchone()
if quoted_content and "ZTEXT" in quoted_content:
message.quoted_data = quoted_content["ZTEXT"]
else:
message.quoted_data = None
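Reviewer note: this change replaces one `LIKE` query per quoted message with a dictionary built once from `reply_query`. The sketch below reproduces the idea on an in-memory database. The two-table schema is cut down to the columns the query touches and the sample rows are invented.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Assumed, reduced schema: only the columns used by reply_query.
db.executescript("""
CREATE TABLE ZWAMESSAGE (Z_PK INTEGER PRIMARY KEY, ZSTANZAID TEXT, ZTEXT TEXT);
CREATE TABLE ZWAMEDIAITEM (ZMESSAGE INTEGER, ZTITLE TEXT);
""")
key_a = "A" * 17
key_b = "B" * 17
db.executemany(
    "INSERT INTO ZWAMESSAGE VALUES (?, ?, ?)",
    [
        (1, key_a + "-suffix", "hello"),  # longer ID, truncated to 17 chars
        (2, key_b, None),                 # no text, caption comes from ZTITLE
        (3, None, "no stanza id"),        # skipped: nothing to key on
    ],
)
db.execute("INSERT INTO ZWAMEDIAITEM VALUES (2, 'photo caption')")

reply_query = """SELECT ZSTANZAID, ZTEXT, ZTITLE
                 FROM ZWAMESSAGE
                 LEFT JOIN ZWAMEDIAITEM
                     ON ZWAMESSAGE.Z_PK = ZWAMEDIAITEM.ZMESSAGE
                 WHERE ZTEXT IS NOT NULL
                    OR ZTITLE IS NOT NULL;"""

# One pass over the table; later lookups are plain dict accesses.
message_map = {
    row[0][:17]: row[1] or row[2]
    for row in db.execute(reply_query)
    if row[0]
}

quoted_text = message_map.get(key_a)
quoted_caption = message_map.get(key_b)
quoted_unknown = message_map.get("Z" * 17)
```

The trade-off is memory for speed: the map holds one string per message with text or a title, in exchange for removing a table scan per reply. It also picks up media captions through `ZTITLE`, which the removed query did not look at.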
# Handle stickers # Handle stickers
if content["ZMESSAGETYPE"] == 15: if content["ZMESSAGETYPE"] == 15:
@@ -311,7 +312,7 @@ def process_message_text(message, content):
message.data = msg message.data = msg
def media(db, data, media_folder, filter_date, filter_chat, filter_empty, separate_media=False): def media(db, data, media_folder, filter_date, filter_chat, filter_empty, separate_media=False, fix_dot_files=False):
"""Process media files from WhatsApp messages.""" """Process media files from WhatsApp messages."""
c = db.cursor() c = db.cursor()
@@ -367,20 +368,15 @@ def media(db, data, media_folder, filter_date, filter_chat, filter_empty, separa
# Process each media item # Process each media item
mime = MimeTypes() mime = MimeTypes()
i = 0 with tqdm(total=total_row_number, desc="Processing media", unit="media", leave=False) as pbar:
content = c.fetchone() while (content := c.fetchone()) is not None:
while content is not None: process_media_item(content, data, media_folder, mime, separate_media, fix_dot_files)
process_media_item(content, data, media_folder, mime, separate_media) pbar.update(1)
total_time = pbar.format_dict['elapsed']
# Update progress logger.info(f"Processed {total_row_number} media in {convert_time_unit(total_time)}{CLEAR_LINE}")
i += 1
if i % 100 == 0:
logger.info(f"Processing media...({i}/{total_row_number})\r")
content = c.fetchone()
logger.info(f"Processed {total_row_number} media{CLEAR_LINE}")
def process_media_item(content, data, media_folder, mime, separate_media): def process_media_item(content, data, media_folder, mime, separate_media, fix_dot_files=False):
"""Process a single media item.""" """Process a single media item."""
file_path = f"{media_folder}/Message/{content['ZMEDIALOCALPATH']}" file_path = f"{media_folder}/Message/{content['ZMEDIALOCALPATH']}"
current_chat = data.get_chat(content["ZCONTACTJID"]) current_chat = data.get_chat(content["ZCONTACTJID"])
@@ -391,8 +387,6 @@ def process_media_item(content, data, media_folder, mime, separate_media):
current_chat.media_base = media_folder + "/" current_chat.media_base = media_folder + "/"
if os.path.isfile(file_path): if os.path.isfile(file_path):
message.data = '/'.join(file_path.split("/")[1:])
# Set MIME type # Set MIME type
if content["ZVCARDSTRING"] is None: if content["ZVCARDSTRING"] is None:
guess = mime.guess_type(file_path)[0] guess = mime.guess_type(file_path)[0]
@@ -400,6 +394,16 @@ def process_media_item(content, data, media_folder, mime, separate_media):
else: else:
message.mime = content["ZVCARDSTRING"] message.mime = content["ZVCARDSTRING"]
if fix_dot_files and file_path.endswith("."):
extension = mime.guess_extension(message.mime)
if message.mime == "application/octet-stream" or not extension:
new_file_path = file_path[:-1]
else:
extension = mime.guess_extension(message.mime)
new_file_path = file_path[:-1] + extension
os.rename(file_path, new_file_path)
file_path = new_file_path
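Reviewer note: the `fix_dot_files` branch renames media files whose names end in a bare dot, using the MIME type to pick an extension when one is known. A standalone sketch of that logic follows. `fix_trailing_dot` is a hypothetical helper name, it uses the module-level `mimetypes.guess_extension` rather than a `MimeTypes` instance, and it assumes a filesystem that allows names ending in a dot.

```python
import mimetypes
import os
import tempfile


def fix_trailing_dot(file_path, mime_type):
    """Rename 'name.' to 'name<ext>', or to 'name' if no extension is known.

    Hypothetical helper mirroring the branch added in the diff.
    """
    if not file_path.endswith("."):
        return file_path
    extension = mimetypes.guess_extension(mime_type) if mime_type else None
    if mime_type == "application/octet-stream" or not extension:
        new_path = file_path[:-1]              # just drop the dangling dot
    else:
        new_path = file_path[:-1] + extension  # e.g. "doc." -> "doc.pdf"
    os.rename(file_path, new_path)
    return new_path


with tempfile.TemporaryDirectory() as tmp:
    known = os.path.join(tmp, "document.")
    unknown = os.path.join(tmp, "blob.")
    for path in (known, unknown):
        open(path, "w").close()

    fixed_known = os.path.basename(fix_trailing_dot(known, "application/pdf"))
    fixed_unknown = os.path.basename(
        fix_trailing_dot(unknown, "application/octet-stream")
    )
    remaining = sorted(os.listdir(tmp))
```

Note that the rename happens in place in the media folder, so the option modifies the extracted backup rather than only the exported copy.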
# Handle separate media option # Handle separate media option
if separate_media: if separate_media:
chat_display_name = safe_name( chat_display_name = safe_name(
@@ -409,7 +413,9 @@ def process_media_item(content, data, media_folder, mime, separate_media):
Path(new_folder).mkdir(parents=True, exist_ok=True) Path(new_folder).mkdir(parents=True, exist_ok=True)
new_path = os.path.join(new_folder, current_filename) new_path = os.path.join(new_folder, current_filename)
shutil.copy2(file_path, new_path) shutil.copy2(file_path, new_path)
message.data = '/'.join(new_path.split("\\")[1:]) message.data = '/'.join(new_path.split("/")[1:])
else:
message.data = '/'.join(file_path.split("/")[1:])
else: else:
# Handle missing media # Handle missing media
message.data = "The media is missing" message.data = "The media is missing"
@@ -463,10 +469,12 @@ def vcard(db, data, media_folder, filter_date, filter_chat, filter_empty):
Path(path).mkdir(parents=True, exist_ok=True) Path(path).mkdir(parents=True, exist_ok=True)
# Process each vCard # Process each vCard
for index, content in enumerate(contents): with tqdm(total=total_row_number, desc="Processing vCards", unit="vcard", leave=False) as pbar:
process_vcard_item(content, path, data) for content in contents:
logger.info(f"Processing vCards...({index + 1}/{total_row_number})\r") process_vcard_item(content, path, data)
logger.info(f"Processed {total_row_number} vCards{CLEAR_LINE}") pbar.update(1)
total_time = pbar.format_dict['elapsed']
logger.info(f"Processed {total_row_number} vCards in {convert_time_unit(total_time)}{CLEAR_LINE}")
def process_vcard_item(content, path, data): def process_vcard_item(content, path, data):
@@ -526,8 +534,6 @@ def calls(db, data, timezone_offset, filter_chat):
if total_row_number == 0: if total_row_number == 0:
return return
logger.info(f"Processed {total_row_number} calls{CLEAR_LINE}\n")
# Fetch call records # Fetch call records
calls_query = f""" calls_query = f"""
SELECT ZCALLIDSTRING, SELECT ZCALLIDSTRING,
@@ -552,14 +558,15 @@ def calls(db, data, timezone_offset, filter_chat):
# Create calls chat # Create calls chat
chat = ChatStore(Device.ANDROID, "WhatsApp Calls") chat = ChatStore(Device.ANDROID, "WhatsApp Calls")
# Process each call with tqdm(total=total_row_number, desc="Processing calls", unit="call", leave=False) as pbar:
content = c.fetchone() while (content := c.fetchone()) is not None:
while content is not None: process_call_record(content, chat, data, timezone_offset)
process_call_record(content, chat, data, timezone_offset) pbar.update(1)
content = c.fetchone() total_time = pbar.format_dict['elapsed']
# Add calls chat to data # Add calls chat to data
data.add_chat("000000000000000", chat) data.add_chat("000000000000000", chat)
logger.info(f"Processed {total_row_number} calls in {convert_time_unit(total_time)}{CLEAR_LINE}")
def process_call_record(content, chat, data, timezone_offset): def process_call_record(content, chat, data, timezone_offset):
@@ -570,7 +577,7 @@ def process_call_record(content, chat, data, timezone_offset):
timestamp=ts, timestamp=ts,
time=ts, time=ts,
key_id=content["ZCALLIDSTRING"], key_id=content["ZCALLIDSTRING"],
timezone_offset=timezone_offset if timezone_offset else CURRENT_TZ_OFFSET timezone_offset=timezone_offset
) )
# Set sender info # Set sender info


@@ -6,7 +6,9 @@ import sqlite3
import os import os
import getpass import getpass
from sys import exit, platform as osname from sys import exit, platform as osname
from Whatsapp_Chat_Exporter.utility import CLEAR_LINE, WhatsAppIdentifier import sys
from tqdm import tqdm
from Whatsapp_Chat_Exporter.utility import CLEAR_LINE, WhatsAppIdentifier, convert_time_unit
from Whatsapp_Chat_Exporter.bplist import BPListReader from Whatsapp_Chat_Exporter.bplist import BPListReader
try: try:
from iphone_backup_decrypt import EncryptedBackup, RelativePath from iphone_backup_decrypt import EncryptedBackup, RelativePath
@@ -79,6 +81,8 @@ class BackupExtractor:
logger.info(f"Encryption detected on the backup!{CLEAR_LINE}") logger.info(f"Encryption detected on the backup!{CLEAR_LINE}")
password = getpass.getpass("Enter the password for the backup:") password = getpass.getpass("Enter the password for the backup:")
sys.stdout.write("\033[F\033[K")
sys.stdout.flush()
self._decrypt_backup(password) self._decrypt_backup(password)
self._extract_decrypted_files() self._extract_decrypted_files()
@@ -89,7 +93,7 @@ class BackupExtractor:
Args: Args:
password (str): The password for the encrypted backup. password (str): The password for the encrypted backup.
""" """
logger.info(f"Trying to decrypt the iOS backup...{CLEAR_LINE}") logger.info(f"Trying to open the iOS backup...{CLEAR_LINE}")
self.backup = EncryptedBackup( self.backup = EncryptedBackup(
backup_directory=self.base_dir, backup_directory=self.base_dir,
passphrase=password, passphrase=password,
@@ -97,7 +101,7 @@ class BackupExtractor:
check_same_thread=False, check_same_thread=False,
decrypt_chunk_size=self.decrypt_chunk_size, decrypt_chunk_size=self.decrypt_chunk_size,
) )
logger.info(f"iOS backup decrypted successfully{CLEAR_LINE}") logger.info(f"iOS backup is opened successfully{CLEAR_LINE}")
logger.info("Decrypting WhatsApp database...\r") logger.info("Decrypting WhatsApp database...\r")
try: try:
self.backup.extract_file( self.backup.extract_file(
@@ -130,9 +134,12 @@ class BackupExtractor:
def _extract_decrypted_files(self): def _extract_decrypted_files(self):
"""Extract all WhatsApp files after decryption""" """Extract all WhatsApp files after decryption"""
pbar = tqdm(desc="Decrypting and extracting files", unit="file", leave=False)
def extract_progress_handler(file_id, domain, relative_path, n, total_files): def extract_progress_handler(file_id, domain, relative_path, n, total_files):
if n % 100 == 0: if pbar.total is None:
logger.info(f"Decrypting and extracting files...({n}/{total_files})\r") pbar.total = total_files
pbar.n = n
pbar.refresh()
return True return True
self.backup.extract_files( self.backup.extract_files(
@@ -141,7 +148,9 @@ class BackupExtractor:
preserve_folders=True, preserve_folders=True,
filter_callback=extract_progress_handler filter_callback=extract_progress_handler
) )
logger.info(f"All required files are decrypted and extracted.{CLEAR_LINE}") total_time = pbar.format_dict['elapsed']
pbar.close()
logger.info(f"All required files are decrypted and extracted in {convert_time_unit(total_time)}{CLEAR_LINE}")
def _extract_unencrypted_backup(self): def _extract_unencrypted_backup(self):
""" """
@@ -192,7 +201,6 @@ class BackupExtractor:
c = manifest.cursor() c = manifest.cursor()
c.execute(f"SELECT count() FROM Files WHERE domain = '{_wts_id}'") c.execute(f"SELECT count() FROM Files WHERE domain = '{_wts_id}'")
total_row_number = c.fetchone()[0] total_row_number = c.fetchone()[0]
logger.info(f"Extracting WhatsApp files...(0/{total_row_number})\r")
c.execute( c.execute(
f""" f"""
SELECT fileID, relativePath, flags, file AS metadata, SELECT fileID, relativePath, flags, file AS metadata,
@@ -205,33 +213,30 @@ class BackupExtractor:
if not os.path.isdir(_wts_id): if not os.path.isdir(_wts_id):
os.mkdir(_wts_id) os.mkdir(_wts_id)
row = c.fetchone() with tqdm(total=total_row_number, desc="Extracting WhatsApp files", unit="file", leave=False) as pbar:
while row is not None: while (row := c.fetchone()) is not None:
if not row["relativePath"]: # Skip empty relative paths if not row["relativePath"]: # Skip empty relative paths
row = c.fetchone() continue
continue
destination = os.path.join(_wts_id, row["relativePath"]) destination = os.path.join(_wts_id, row["relativePath"])
hashes = row["fileID"] hashes = row["fileID"]
folder = hashes[:2] folder = hashes[:2]
flags = row["flags"] flags = row["flags"]
if flags == 2: # Directory if flags == 2: # Directory
try: try:
os.mkdir(destination) os.mkdir(destination)
except FileExistsError: except FileExistsError:
pass pass
elif flags == 1: # File elif flags == 1: # File
shutil.copyfile(os.path.join(self.base_dir, folder, hashes), destination) shutil.copyfile(os.path.join(self.base_dir, folder, hashes), destination)
metadata = BPListReader(row["metadata"]).parse() metadata = BPListReader(row["metadata"]).parse()
creation = metadata["$objects"][1]["Birth"] _creation = metadata["$objects"][1]["Birth"]
modification = metadata["$objects"][1]["LastModified"] modification = metadata["$objects"][1]["LastModified"]
os.utime(destination, (modification, modification)) os.utime(destination, (modification, modification))
pbar.update(1)
if row["_index"] % 100 == 0: total_time = pbar.format_dict['elapsed']
logger.info(f"Extracting WhatsApp files...({row['_index']}/{total_row_number})\r") logger.info(f"Extracted {total_row_number} WhatsApp files in {convert_time_unit(total_time)}{CLEAR_LINE}")
row = c.fetchone()
logger.info(f"Extracted WhatsApp files...({total_row_number}){CLEAR_LINE}")
def extract_media(base_dir, identifiers, decrypt_chunk_size): def extract_media(base_dir, identifiers, decrypt_chunk_size):


@@ -5,13 +5,13 @@ import json
import os import os
import unicodedata import unicodedata
import re import re
import string
import math import math
import shutil import shutil
from bleach import clean as sanitize from bleach import clean as sanitize
from markupsafe import Markup from markupsafe import Markup
from datetime import datetime, timedelta from datetime import datetime, timedelta
from enum import IntEnum from enum import IntEnum
from tqdm import tqdm
from Whatsapp_Chat_Exporter.data_model import ChatCollection, ChatStore, Timing from Whatsapp_Chat_Exporter.data_model import ChatCollection, ChatStore, Timing
from typing import Dict, List, Optional, Tuple, Union from typing import Dict, List, Optional, Tuple, Union
try: try:
@@ -248,13 +248,13 @@ def import_from_json(json_file: str, data: ChatCollection):
with open(json_file, "r") as f: with open(json_file, "r") as f:
temp_data = json.loads(f.read()) temp_data = json.loads(f.read())
total_row_number = len(tuple(temp_data.keys())) total_row_number = len(tuple(temp_data.keys()))
logger.info(f"Importing chats from JSON...(0/{total_row_number})\r") with tqdm(total=total_row_number, desc="Importing chats from JSON", unit="chat", leave=False) as pbar:
for index, (jid, chat_data) in enumerate(temp_data.items()): for jid, chat_data in temp_data.items():
chat = ChatStore.from_json(chat_data) chat = ChatStore.from_json(chat_data)
data.add_chat(jid, chat) data.add_chat(jid, chat)
logger.info( pbar.update(1)
f"Importing chats from JSON...({index + 1}/{total_row_number})\r") total_time = pbar.format_dict['elapsed']
logger.info(f"Imported {total_row_number} chats from JSON{CLEAR_LINE}") logger.info(f"Imported {total_row_number} chats from JSON in {convert_time_unit(total_time)}{CLEAR_LINE}")
def incremental_merge(source_dir: str, target_dir: str, media_dir: str, pretty_print_json: int, avoid_encoding_json: bool): def incremental_merge(source_dir: str, target_dir: str, media_dir: str, pretty_print_json: int, avoid_encoding_json: bool):
@@ -439,7 +439,7 @@ CRYPT14_OFFSETS = (
{"iv": 67, "db": 193}, {"iv": 67, "db": 193},
{"iv": 67, "db": 194}, {"iv": 67, "db": 194},
{"iv": 67, "db": 158}, {"iv": 67, "db": 158},
{"iv": 67, "db": 196} {"iv": 67, "db": 196},
) )
@@ -534,7 +534,7 @@ def determine_metadata(content: sqlite3.Row, init_msg: Optional[str]) -> Optiona
else: else:
msg = "The security code in this chat changed" msg = "The security code in this chat changed"
elif content["action_type"] == 58: elif content["action_type"] == 58:
msg = "You blocked this contact" msg = "You blocked/unblocked this contact"
elif content["action_type"] == 67: elif content["action_type"] == 67:
return # (PM) this contact use secure service from Facebook??? return # (PM) this contact use secure service from Facebook???
elif content["action_type"] == 69: elif content["action_type"] == 69:
@@ -572,6 +572,68 @@ def get_status_location(output_folder: str, offline_static: str) -> str:
return w3css return w3css
def check_jid_map(db: sqlite3.Connection) -> bool:
"""
Checks if the jid_map table exists in the database.
Args:
db (sqlite3.Connection): The SQLite database connection.
Returns:
bool: True if the jid_map table exists, False otherwise.
"""
cursor = db.cursor()
cursor.execute("SELECT name FROM sqlite_master WHERE type='table' AND name='jid_map'")
return cursor.fetchone() is not None
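The preflight check above can be tried against an in-memory database. This is a minimal sketch, not the project's code: `table_exists` is a hypothetical generalisation of `check_jid_map` that takes the table name as a bound parameter, and the `jid_map` columns are assumed from the JOIN clauses in this diff.

```python
import sqlite3


def table_exists(db: sqlite3.Connection, name: str) -> bool:
    """Hypothetical helper: True if a table called `name` is listed in sqlite_master."""
    cursor = db.cursor()
    cursor.execute(
        "SELECT name FROM sqlite_master WHERE type='table' AND name=?",
        (name,),
    )
    return cursor.fetchone() is not None


db = sqlite3.connect(":memory:")
before = table_exists(db, "jid_map")  # empty database, no table yet
# Column names are an assumption taken from the JOIN clauses in the diff.
db.execute("CREATE TABLE jid_map (lid_row_id INTEGER, jid_row_id INTEGER)")
after = table_exists(db, "jid_map")
print(before, after)
```

Running the check once and passing the boolean around, as the diff does, avoids repeating the `sqlite_master` lookup for every query.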
def get_jid_map_join(jid_map_exists: bool) -> str:
"""
Returns the SQL JOIN statements for jid_map table.
"""
if not jid_map_exists:
return ""
else:
return """LEFT JOIN jid_map as jid_map_global
ON chat.jid_row_id = jid_map_global.lid_row_id
LEFT JOIN jid lid_global
ON jid_map_global.jid_row_id = lid_global._id
LEFT JOIN jid_map as jid_map_group
ON message.sender_jid_row_id = jid_map_group.lid_row_id
LEFT JOIN jid lid_group
ON jid_map_group.jid_row_id = lid_group._id"""
def get_jid_map_selection(jid_map_exists: bool) -> tuple:
"""
Returns the SQL selection statements for jid_map table.
"""
if not jid_map_exists:
return "jid_global.raw_string", "jid_group.raw_string"
else:
return (
"COALESCE(lid_global.raw_string, jid_global.raw_string)",
"COALESCE(lid_group.raw_string, jid_group.raw_string)"
)
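A rough sketch of how the two helpers above might be spliced into a query. The helper bodies are copied from the diff; `build_query`, the toy schema, and the sample rows are assumptions made for illustration only, and the real queries in the exporter select many more columns.

```python
import sqlite3


def get_jid_map_join(jid_map_exists: bool) -> str:
    if not jid_map_exists:
        return ""
    return """LEFT JOIN jid_map AS jid_map_global
            ON chat.jid_row_id = jid_map_global.lid_row_id
        LEFT JOIN jid lid_global
            ON jid_map_global.jid_row_id = lid_global._id
        LEFT JOIN jid_map AS jid_map_group
            ON message.sender_jid_row_id = jid_map_group.lid_row_id
        LEFT JOIN jid lid_group
            ON jid_map_group.jid_row_id = lid_group._id"""


def get_jid_map_selection(jid_map_exists: bool) -> tuple:
    if not jid_map_exists:
        return "jid_global.raw_string", "jid_group.raw_string"
    return (
        "COALESCE(lid_global.raw_string, jid_global.raw_string)",
        "COALESCE(lid_group.raw_string, jid_group.raw_string)",
    )


def build_query(jid_map_exists: bool) -> str:
    """Hypothetical query builder; the real exporter queries are larger."""
    chat_sel, sender_sel = get_jid_map_selection(jid_map_exists)
    return f"""
        SELECT {chat_sel} AS chat_jid, {sender_sel} AS sender_jid
        FROM message
        INNER JOIN chat ON chat._id = message.chat_row_id
        LEFT JOIN jid jid_global ON chat.jid_row_id = jid_global._id
        LEFT JOIN jid jid_group ON message.sender_jid_row_id = jid_group._id
        {get_jid_map_join(jid_map_exists)}
    """


# Toy schema and data (assumed): jid row 1 is a LID, row 2 is the mapped JID.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE jid (_id INTEGER PRIMARY KEY, raw_string TEXT);
    CREATE TABLE chat (_id INTEGER PRIMARY KEY, jid_row_id INTEGER);
    CREATE TABLE message (_id INTEGER PRIMARY KEY, chat_row_id INTEGER,
                          sender_jid_row_id INTEGER);
    CREATE TABLE jid_map (lid_row_id INTEGER, jid_row_id INTEGER);
    INSERT INTO jid VALUES (1, '123@lid'), (2, '15550001@s.whatsapp.net');
    INSERT INTO chat VALUES (1, 1);
    INSERT INTO message VALUES (1, 1, 1);
    INSERT INTO jid_map VALUES (1, 2);
""")
with_map = db.execute(build_query(True)).fetchone()
without_map = db.execute(build_query(False)).fetchone()
print(with_map, without_map)
```

With the mapping joined in, `COALESCE` prefers the mapped identifier and falls back to the raw one when no `jid_map` row matches; without the table, the query never references `jid_map` at all.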
def get_transcription_selection(db: sqlite3.Connection) -> str:
"""
Returns the SQL selection statement for transcription text based on the database schema.
Args:
db (sqlite3.Connection): The SQLite database connection.
Returns:
str: The SQL selection statement for transcription.
"""
cursor = db.cursor()
cursor.execute("PRAGMA table_info(message_media)")
columns = [row[1] for row in cursor.fetchall()]
if "raw_transcription_text" in columns:
return "message_media.raw_transcription_text AS transcription_text"
else:
return "NULL AS transcription_text"
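The column probe above can be exercised on two throwaway schemas. The function body mirrors the diff; the two `message_media` table definitions are assumptions made up for this sketch.

```python
import sqlite3


def get_transcription_selection(db: sqlite3.Connection) -> str:
    cursor = db.cursor()
    cursor.execute("PRAGMA table_info(message_media)")
    # Each row is (cid, name, type, notnull, dflt_value, pk); index 1 is the column name.
    columns = [row[1] for row in cursor.fetchall()]
    if "raw_transcription_text" in columns:
        return "message_media.raw_transcription_text AS transcription_text"
    return "NULL AS transcription_text"


# Assumed "old" schema without the transcription column.
old_db = sqlite3.connect(":memory:")
old_db.execute("CREATE TABLE message_media (message_row_id INTEGER)")

# Assumed "new" schema with the transcription column.
new_db = sqlite3.connect(":memory:")
new_db.execute(
    "CREATE TABLE message_media (message_row_id INTEGER, raw_transcription_text TEXT)"
)

old_sel = get_transcription_selection(old_db)
new_sel = get_transcription_selection(new_db)
print(old_sel)
print(new_sel)
```

Selecting `NULL AS transcription_text` on older schemas keeps the result set the same shape, so downstream code can read `transcription_text` without branching on the database version.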
def setup_template(template: Optional[str], no_avatar: bool, experimental: bool = False) -> jinja2.Template: def setup_template(template: Optional[str], no_avatar: bool, experimental: bool = False) -> jinja2.Template:
""" """
Sets up the Jinja2 template environment and loads the template. Sets up the Jinja2 template environment and loads the template.
@@ -639,11 +701,17 @@ def get_from_string(msg: Dict, chat_id: str) -> str:
def get_chat_type(chat_id: str) -> str: def get_chat_type(chat_id: str) -> str:
"""Return the chat type based on the whatsapp id""" """Return the chat type based on the whatsapp id"""
if chat_id.endswith("@s.whatsapp.net"): if chat_id == "000000000000000":
return "calls"
elif chat_id.endswith("@s.whatsapp.net"):
return "personal_chat" return "personal_chat"
if chat_id.endswith("@g.us"): elif chat_id.endswith("@g.us"):
return "private_group" return "private_group"
logger.warning("Unknown chat type for %s, defaulting to private_group", chat_id) elif chat_id == "status@broadcast":
return "status_broadcast"
elif chat_id.endswith("@broadcast"):
return "broadcast_channel"
logger.warning(f"Unknown chat type for {chat_id}, defaulting to private_group{CLEAR_LINE}")
return "private_group" return "private_group"
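The branch order in the new `get_chat_type` matters: `status@broadcast` also ends with `@broadcast`, so the exact match has to come first. A standalone sketch of the same classification, with the hypothetical name `classify_chat_id` and the logging call left out, and with made-up sample identifiers:

```python
def classify_chat_id(chat_id: str) -> str:
    """Standalone copy of the branching in the diff, without the warning log."""
    if chat_id == "000000000000000":
        return "calls"
    if chat_id.endswith("@s.whatsapp.net"):
        return "personal_chat"
    if chat_id.endswith("@g.us"):
        return "private_group"
    if chat_id == "status@broadcast":  # exact match before the generic suffix
        return "status_broadcast"
    if chat_id.endswith("@broadcast"):
        return "broadcast_channel"
    return "private_group"  # same fallback as the diff


samples = [
    "000000000000000",
    "15550001@s.whatsapp.net",
    "1234-5678@g.us",
    "status@broadcast",
    "99887766@broadcast",
    "unknown-id",
]
results = {s: classify_chat_id(s) for s in samples}
for chat_id, chat_type in results.items():
    print(chat_id, "->", chat_type)
```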
@@ -674,34 +742,35 @@ def telegram_json_format(jik: str, data: Dict, timezone_offset) -> Dict:
except ValueError: except ValueError:
# not a real chat: e.g. statusbroadcast # not a real chat: e.g. statusbroadcast
chat_id = 0 chat_id = 0
obj = { json_obj = {
"name": data["name"] if data["name"] else jik, "name": data["name"] if data["name"] else jik,
"type": get_chat_type(jik), "type": get_chat_type(jik),
"id": chat_id, "id": chat_id,
"messages": [ { "messages": [ {
"id": int(msgId), "id": int(msgId),
"type": "message", "type": "message",
"date": timing.format_timestamp(msg["timestamp"], "%Y-%m-%dT%H:%M:%S"), "date": timing.format_timestamp(msg["timestamp"], "%Y-%m-%dT%H:%M:%S"),
"date_unixtime": int(msg["timestamp"]), "date_unixtime": int(msg["timestamp"]),
"from": get_from_string(msg, chat_id), "from": get_from_string(msg, chat_id),
"from_id": get_from_id(msg, chat_id), "from_id": get_from_id(msg, chat_id),
"reply_to_message_id": get_reply_id(data, msg["reply"]), "reply_to_message_id": get_reply_id(data, msg["reply"]),
"text": msg["data"], "text": msg["data"],
"text_entities": [ "text_entities": [
{ {
# TODO this will lose formatting and different types # TODO this will lose formatting and different types
"type": "plain", "type": "plain",
"text": msg["data"], "text": msg["data"],
} }
], ],
} for msgId, msg in data["messages"].items()]
} }
for msgId, msg in data["messages"].items()]
}
# remove empty messages and replies # remove empty messages and replies
for msg_id, msg in enumerate(obj["messages"]): for msg_id, msg in enumerate(json_obj["messages"]):
if not msg["reply_to_message_id"]: if not msg["reply_to_message_id"]:
del obj["messages"][msg_id]["reply_to_message_id"] del json_obj["messages"][msg_id]["reply_to_message_id"]
obj["messages"] = [m for m in obj["messages"] if m["text"]] json_obj["messages"] = [m for m in json_obj["messages"] if m["text"]]
return obj return json_obj
class WhatsAppIdentifier(StrEnum): class WhatsAppIdentifier(StrEnum):


@@ -127,6 +127,125 @@
--tw-translate-x: -50%; --tw-translate-x: -50%;
transform: translate(var(--tw-translate-x), var(--tw-translate-y)) rotate(var(--tw-rotate)) skewX(var(--tw-skew-x)) skewY(var(--tw-skew-y)) scaleX(var(--tw-scale-x)) scaleY(var(--tw-scale-y)); transform: translate(var(--tw-translate-x), var(--tw-translate-y)) rotate(var(--tw-rotate)) skewX(var(--tw-skew-x)) skewY(var(--tw-skew-y)) scaleX(var(--tw-scale-x)) scaleY(var(--tw-scale-y));
} }
.status-indicator {
display: inline-block;
margin-left: 4px;
font-size: 0.8em;
color: #8c8c8c;
}
.status-indicator.read {
color: #34B7F1;
}
.play-icon {
width: 0;
height: 0;
border-left: 8px solid white;
border-top: 5px solid transparent;
border-bottom: 5px solid transparent;
filter: drop-shadow(0 1px 2px rgba(0, 0, 0, 0.3));
}
.speaker-icon {
position: relative;
width: 8px;
height: 6px;
background: #666;
border-radius: 1px 0 0 1px;
}
.speaker-icon::before {
content: '';
position: absolute;
right: -4px;
top: -1px;
width: 0;
height: 0;
border-left: 4px solid #666;
border-top: 4px solid transparent;
border-bottom: 4px solid transparent;
}
.speaker-icon::after {
content: '';
position: absolute;
right: -8px;
top: -3px;
width: 8px;
height: 12px;
border: 2px solid #666;
border-left: none;
border-radius: 0 8px 8px 0;
}
.search-icon {
width: 20px;
height: 20px;
position: relative;
display: inline-block;
}
.search-icon::before {
content: '';
position: absolute;
width: 12px;
height: 12px;
border: 2px solid #aebac1;
border-radius: 50%;
top: 2px;
left: 2px;
}
.search-icon::after {
content: '';
position: absolute;
width: 2px;
height: 6px;
background: #aebac1;
transform: rotate(45deg);
top: 12px;
left: 12px;
}
.arrow-left {
width: 0;
height: 0;
border-top: 6px solid transparent;
border-bottom: 6px solid transparent;
border-right: 8px solid #aebac1;
display: inline-block;
}
.arrow-right {
width: 0;
height: 0;
border-top: 6px solid transparent;
border-bottom: 6px solid transparent;
border-left: 8px solid #aebac1;
display: inline-block;
}
.info-icon {
width: 20px;
height: 20px;
border: 2px solid currentColor;
border-radius: 50%;
position: relative;
display: inline-block;
}
.info-icon::before {
content: 'i';
position: absolute;
top: 50%;
left: 50%;
transform: translate(-50%, -50%);
font-size: 12px;
font-weight: bold;
font-style: normal;
}
</style> </style>
<script> <script>
function search(event) { function search(event) {
@@ -163,34 +282,24 @@
</div> </div>
<div class="flex space-x-4"> <div class="flex space-x-4">
<!-- <button id="searchButton"> <!-- <button id="searchButton">
<svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-[#aebac1]" fill="none" viewBox="0 0 24 24" stroke="currentColor"> <span class="search-icon"></span>
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M21 21l-6-6m2-5a7 7 0 11-14 0 7 7 0 0114 0z" /> </button> -->
</svg> <!-- <span class="arrow-left"></span> -->
</button> -->
<!-- <svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-[#aebac1]" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M15 19l-7-7 7-7" />
</svg> -->
{% if previous %} {% if previous %}
<a href="./{{ previous }}" target="_self"> <a href="./{{ previous }}" target="_self">
<svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-[#aebac1]" fill="none" viewBox="0 0 24 24" stroke="currentColor"> <span class="arrow-left"></span>
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M15 5l-7 7 7 7" />
</svg>
</a> </a>
{% endif %} {% endif %}
{% if next %} {% if next %}
<a href="./{{ next }}" target="_self"> <a href="./{{ next }}" target="_self">
<svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-[#aebac1]" fill="none" viewBox="0 0 24 24" stroke="currentColor"> <span class="arrow-right"></span>
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M9 5l7 7-7 7" />
</svg>
</a> </a>
{% endif %} {% endif %}
</div> </div>
<!-- Search Input Overlay --> <!-- Search Input Overlay -->
<div id="mainSearchInput" class="search-input absolute article top-0 bg-whatsapp-dark p-3 flex items-center space-x-3"> <div id="mainSearchInput" class="search-input absolute article top-0 bg-whatsapp-dark p-3 flex items-center space-x-3">
<button id="closeMainSearch" class="text-[#aebac1]"> <button id="closeMainSearch" class="text-[#aebac1]">
<svg xmlns="http://www.w3.org/2000/svg" class="h-6 w-6" fill="none" viewBox="0 0 24 24" stroke="currentColor"> <span class="arrow-left"></span>
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M15 19l-7-7 7-7" />
</svg>
</button> </button>
<input type="text" placeholder="Search..." class="flex-1 bg-[#1f2c34] text-white rounded-lg px-3 py-1 focus:outline-none" id="mainHeaderSearchInput" onkeyup="search(event)"> <input type="text" placeholder="Search..." class="flex-1 bg-[#1f2c34] text-white rounded-lg px-3 py-1 focus:outline-none" id="mainHeaderSearchInput" onkeyup="search(event)">
</div> </div>
@@ -230,18 +339,44 @@
</div> </div>
</div> </div>
</div> </div>
<div class="bg-whatsapp-light rounded-lg p-2 max-w-[80%] shadow-sm"> <div class="bg-whatsapp-light rounded-lg p-2 max-w-[80%] shadow-sm relative {% if msg.reactions %}mb-2{% endif %}">
{% if msg.reply is not none %} {% if msg.reply is not none %}
<a href="#{{msg.reply}}" target="_self" class="no-base"> <a href="#{{msg.reply}}" target="_self" class="no-base">
<div class="mb-2 p-1 bg-whatsapp-chat-light rounded border-l-4 border-whatsapp text-sm reply-box"> <div
<p class="text-whatsapp font-medium text-xs">Replying to</p> class="mb-2 p-1 bg-whatsapp-chat-light rounded border-l-4 border-whatsapp text-sm reply-box">
<p class="text-[#111b21] text-xs truncate"> <div class="flex items-center gap-2">
{% if msg.quoted_data is not none %} <div class="flex-1 overflow-hidden">
"{{msg.quoted_data}}" <p class="text-whatsapp font-medium text-xs">Replying to</p>
{% else %} <p class="text-[#111b21] text-xs truncate">
this message {% if msg.quoted_data is not none %}
"{{msg.quoted_data}}"
{% else %}
this message
{% endif %}
</p>
</div>
{% set replied_msg = msgs | selectattr('key_id', 'equalto', msg.reply) | first %}
{% if replied_msg and replied_msg.media == true %}
<div class="flex-shrink-0">
{% if "image/" in replied_msg.mime %}
<img src="{{ replied_msg.thumb if replied_msg.thumb is not none else replied_msg.data }}"
class="w-8 h-8 rounded object-cover" loading="lazy" />
{% elif "video/" in replied_msg.mime %}
<div class="relative w-8 h-8 rounded overflow-hidden bg-gray-200">
<img src="{{ replied_msg.thumb if replied_msg.thumb is not none else replied_msg.data }}"
class="w-full h-full object-cover" loading="lazy" />
<div class="absolute inset-0 flex items-center justify-center">
<div class="play-icon"></div>
</div>
</div>
{% elif "audio/" in replied_msg.mime %}
<div class="w-8 h-8 rounded bg-gray-200 flex items-center justify-center">
<div class="speaker-icon"></div>
</div>
{% endif %}
</div>
{% endif %} {% endif %}
</p> </div>
</div> </div>
</a> </a>
{% endif %} {% endif %}
@@ -281,28 +416,73 @@
{% filter escape %}{{ msg.data }}{% endfilter %} {% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %} {% endif %}
{% if msg.caption is not none %} {% if msg.caption is not none %}
{{ msg.caption | urlize(none, true, '_blank') }} <p class='mt-1 {% if "audio/" in msg.mime %}text-[#808080]{% endif %}'>
{{ msg.caption | urlize(none, true, '_blank') }}
</p>
{% endif %} {% endif %}
{% endif %} {% endif %}
{% endif %} {% endif %}
</p> </p>
<p class="text-[10px] text-[#667781] text-right mt-1">{{ msg.time }}</p> <p class="text-[10px] text-[#667781] text-right mt-1">{{ msg.time }}
<span class="status-indicator{% if msg.read_timestamp %} read{% endif %}">
{% if msg.received_timestamp %}
✓✓
{% else %}
{% endif %}
</span>
</p>
{% if msg.reactions %}
<div class="flex flex-wrap gap-1 mt-1 justify-end absolute -bottom-3 -right-2">
{% for sender, emoji in msg.reactions.items() %}
<div class="bg-white rounded-full px-1.5 py-0.5 text-xs shadow-sm border border-gray-200 cursor-help" title="{{ sender }}">
{{ emoji }}
</div>
{% endfor %}
</div>
{% endif %}
</div> </div>
</div> </div>
{% else %} {% else %}
<div class="flex justify-start items-center group" id="{{ msg.key_id }}"> <div class="flex justify-start items-center group" id="{{ msg.key_id }}">
<div class="bg-white rounded-lg p-2 max-w-[80%] shadow-sm"> <div class="bg-white rounded-lg p-2 max-w-[80%] shadow-sm relative {% if msg.reactions %}mb-2{% endif %}">
{% if msg.reply is not none %} {% if msg.reply is not none %}
<a href="#{{msg.reply}}" target="_self" class="no-base"> <a href="#{{msg.reply}}" target="_self" class="no-base">
<div class="mb-2 p-1 bg-whatsapp-chat-light rounded border-l-4 border-whatsapp text-sm reply-box"> <div
<p class="text-whatsapp font-medium text-xs">Replying to</p> class="mb-2 p-1 bg-whatsapp-chat-light rounded border-l-4 border-whatsapp text-sm reply-box">
<p class="text-[#808080] text-xs truncate"> <div class="flex items-center gap-2">
{% if msg.quoted_data is not none %} <div class="flex-1 overflow-hidden">
{{msg.quoted_data}} <p class="text-whatsapp font-medium text-xs">Replying to</p>
{% else %} <p class="text-[#808080] text-xs truncate">
this message {% if msg.quoted_data is not none %}
{{msg.quoted_data}}
{% else %}
this message
{% endif %}
</p>
</div>
{% set replied_msg = msgs | selectattr('key_id', 'equalto', msg.reply) | first %}
{% if replied_msg and replied_msg.media == true %}
<div class="flex-shrink-0">
{% if "image/" in replied_msg.mime %}
<img src="{{ replied_msg.thumb if replied_msg.thumb is not none else replied_msg.data }}"
class="w-8 h-8 rounded object-cover" loading="lazy" />
{% elif "video/" in replied_msg.mime %}
<div class="relative w-8 h-8 rounded overflow-hidden bg-gray-200">
<img src="{{ replied_msg.thumb if replied_msg.thumb is not none else replied_msg.data }}"
class="w-full h-full object-cover" loading="lazy" />
<div class="absolute inset-0 flex items-center justify-center">
<div class="play-icon"></div>
</div>
</div>
{% elif "audio/" in replied_msg.mime %}
<div class="w-8 h-8 rounded bg-gray-200 flex items-center justify-center">
<div class="speaker-icon"></div>
</div>
{% endif %}
</div>
{% endif %} {% endif %}
</p> </div>
</div> </div>
</a> </a>
{% endif %} {% endif %}
@@ -342,7 +522,9 @@
{% filter escape %}{{ msg.data }}{% endfilter %} {% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %} {% endif %}
{% if msg.caption is not none %} {% if msg.caption is not none %}
{{ msg.caption | urlize(none, true, '_blank') }} <p class='mt-1 {% if "audio/" in msg.mime %}text-[#808080]{% endif %}'>
{{ msg.caption | urlize(none, true, '_blank') }}
</p>
{% endif %} {% endif %}
{% endif %} {% endif %}
{% endif %} {% endif %}
@@ -356,6 +538,15 @@
<span class="flex-grow min-w-[4px]"></span> <span class="flex-grow min-w-[4px]"></span>
<span class="flex-shrink-0">{{ msg.time }}</span> <span class="flex-shrink-0">{{ msg.time }}</span>
</div> </div>
{% if msg.reactions %}
<div class="flex flex-wrap gap-1 mt-1 justify-start absolute -bottom-3 -left-2">
{% for sender, emoji in msg.reactions.items() %}
<div class="bg-gray-100 rounded-full px-1.5 py-0.5 text-xs shadow-sm border border-gray-200 cursor-help" title="{{ sender }}">
{{ emoji }}
</div>
{% endfor %}
</div>
{% endif %}
</div> </div>
<!-- <div class="opacity-0 group-hover:opacity-100 transition-opacity duration-200 relative ml-2"> <!-- <div class="opacity-0 group-hover:opacity-100 transition-opacity duration-200 relative ml-2">
<div class="relative"> <div class="relative">
@@ -377,20 +568,19 @@
{% endfor %} {% endfor %}
</div> </div>
<footer> <footer>
<h2 class="text-center">
{% if not next %} {% if not next %}
End of History <div class="flex justify-center mb-6">
<div class="bg-[#e1f2fb] rounded-lg px-3 py-2 text-sm text-[#54656f]">
End of History
</div>
</div>
{% endif %} {% endif %}
</h2>
<br> <br>
Portions of this page are reproduced from <a href="https://web.dev/articles/lazy-loading-video">work</a> created and <a href="https://developers.google.com/readme/policies">shared by Google</a> and used according to terms described in the <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache 2.0 License</a>. Portions of this page are reproduced from <a href="https://web.dev/articles/lazy-loading-video">work</a>
created and <a href="https://developers.google.com/readme/policies">shared by Google</a> and used
according to terms described in the <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache 2.0
License</a>.
</footer> </footer>
<svg style="display: none;">
<!-- Tooltip info icon -->
<symbol id="info-icon" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M13 16h-1v-4h-1m1-4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
</symbol>
</svg>
</div> </div>
</article> </article>
</body> </body>


@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project] [project]
name = "whatsapp-chat-exporter" name = "whatsapp-chat-exporter"
version = "0.13.0rc1" version = "0.13.0rc2"
description = "A Whatsapp database parser that provides history of your Whatsapp conversations in HTML and JSON. Android, iOS, iPadOS, Crypt12, Crypt14, Crypt15 supported." description = "A Whatsapp database parser that provides history of your Whatsapp conversations in HTML and JSON. Android, iOS, iPadOS, Crypt12, Crypt14, Crypt15 supported."
readme = "README.md" readme = "README.md"
authors = [ authors = [
@@ -36,17 +36,19 @@ classifiers = [
requires-python = ">=3.10" requires-python = ">=3.10"
dependencies = [ dependencies = [
"jinja2", "jinja2",
"bleach" "bleach",
"tqdm"
] ]
[project.optional-dependencies] [project.optional-dependencies]
android_backup = ["pycryptodome", "javaobj-py3"] android_backup = ["pycryptodome", "javaobj-py3"]
ios_backup = ["iphone_backup_decrypt @ git+https://github.com/KnugiHK/iphone_backup_decrypt"]
crypt12 = ["pycryptodome"] crypt12 = ["pycryptodome"]
crypt14 = ["pycryptodome"] crypt14 = ["pycryptodome"]
crypt15 = ["pycryptodome", "javaobj-py3"] crypt15 = ["pycryptodome", "javaobj-py3"]
all = ["pycryptodome", "javaobj-py3"] all = ["pycryptodome", "javaobj-py3", "iphone_backup_decrypt @ git+https://github.com/KnugiHK/iphone_backup_decrypt"]
everything = ["pycryptodome", "javaobj-py3"] everything = ["pycryptodome", "javaobj-py3", "iphone_backup_decrypt @ git+https://github.com/KnugiHK/iphone_backup_decrypt"]
backup = ["pycryptodome", "javaobj-py3"] backup = ["pycryptodome", "javaobj-py3", "iphone_backup_decrypt @ git+https://github.com/KnugiHK/iphone_backup_decrypt"]
[project.scripts] [project.scripts]
wtsexporter = "Whatsapp_Chat_Exporter.__main__:main" wtsexporter = "Whatsapp_Chat_Exporter.__main__:main"

tests/conftest.py (new file)

@@ -0,0 +1,27 @@
import pytest
import os
def pytest_collection_modifyitems(config, items):
"""
Moves test_nuitka_binary.py to the end and fails if the file is missing.
"""
target_file = "test_nuitka_binary.py"
# Sanity Check: Ensure the file actually exists in the tests directory
test_dir = os.path.join(config.rootdir, "tests")
file_path = os.path.join(test_dir, target_file)
if not os.path.exists(file_path):
pytest.exit(f"\n[FATAL] Required test file '{target_file}' not found in {test_dir}. "
f"Order enforcement failed!", returncode=1)
nuitka_tests = []
remaining_tests = []
for item in items:
if target_file in item.nodeid:
nuitka_tests.append(item)
else:
remaining_tests.append(item)
items[:] = remaining_tests + nuitka_tests
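The reordering in the hook is a stable partition: matching tests move to the end while the relative order within each group is preserved. A minimal sketch on plain strings, assuming node IDs look like `tests/<file>::<test>`; `move_to_end` and the sample IDs are hypothetical and not part of the conftest.

```python
def move_to_end(node_ids: list, target_file: str) -> list:
    """Return node_ids with every ID that mentions target_file moved to the end."""
    matching = [n for n in node_ids if target_file in n]
    remaining = [n for n in node_ids if target_file not in n]
    return remaining + matching


collected = [
    "tests/test_nuitka_binary.py::test_build",
    "tests/test_exporter.py::test_html",
    "tests/test_nuitka_binary.py::test_run",
    "tests/test_utility.py::test_time",
]
ordered = move_to_end(collected, "test_nuitka_binary.py")
for node_id in ordered:
    print(node_id)
```

In the hook itself the result is assigned with `items[:] = ...` so that the list object pytest passed in is modified in place.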


@@ -101,6 +101,7 @@ chat_data_merged = {
"mime": None, "mime": None,
"reply": None, "reply": None,
"quoted_data": None, "quoted_data": None,
'reactions': {},
"caption": None, "caption": None,
"thumb": None, "thumb": None,
"sticker": False, "sticker": False,
@@ -121,6 +122,7 @@ chat_data_merged = {
"mime": None, "mime": None,
"reply": None, "reply": None,
"quoted_data": None, "quoted_data": None,
'reactions': {},
"caption": None, "caption": None,
"thumb": None, "thumb": None,
"sticker": False, "sticker": False,
@@ -141,6 +143,7 @@ chat_data_merged = {
"mime": None, "mime": None,
"reply": None, "reply": None,
"quoted_data": None, "quoted_data": None,
'reactions': {},
"caption": None, "caption": None,
"thumb": None, "thumb": None,
"sticker": False, "sticker": False,