221 Commits

Author SHA1 Message Date
KnugiHK
bac2efe15a Revert "Update README.md"
This reverts commit 1c7d6f7912.
2026-01-24 18:33:10 +08:00
KnugiHK
9a6ee3ce5f Revert "Add iphone_backup_decrypt as an optional dependency (#123)"
This reverts commit 94960e4a23.
2026-01-24 18:31:59 +08:00
KnugiHK
823a89e677 Merge branch 'dev' 2026-01-24 18:21:36 +08:00
KnugiHK
945b422f71 Update ci.yml 2026-01-24 18:21:25 +08:00
KnugiHK
19008a80bc Merge branch 'dev' 2026-01-24 18:09:15 +08:00
KnugiHK
4e877987fb Bump version & update readme 2026-01-24 18:08:43 +08:00
KnugiHK
322b12a5a4 Fix a crash in message counting if chat filter is in use 2026-01-24 18:02:30 +08:00
KnugiHK
1560c49644 Update ci.yml 2026-01-24 17:42:02 +08:00
KnugiHK
28ba97d72f Fix CI on Windows 2026-01-24 17:38:22 +08:00
KnugiHK
eab98ba0d6 Fix crash on pre-release versions and enable update checks for pre-releases 2026-01-24 17:20:07 +08:00
KnugiHK
f920ca82b4 Refactor the logging facility a bit 2026-01-24 17:05:14 +08:00
KnugiHK
4eed3ca321 Refactor CLEAR_LINE in a more pythonic way
So it is easier for contributors to write a logging line for this project.
2026-01-24 16:48:07 +08:00
KnugiHK
746e4e1ac5 Fix and improve the logging facility for incremental merge 2026-01-24 16:24:10 +08:00
KnugiHK
1694ae7dd9 Update utility.py 2026-01-24 01:47:45 +08:00
KnugiHK
f05e0d3451 Refactor incremental_merge 2026-01-24 01:33:18 +08:00
KnugiHK
0c5f2b7f13 Add a comment on SQLi in get_chat_condition 2026-01-24 01:19:55 +08:00
KnugiHK
db01d05263 Refactor get_chat_condition to increase maintainability 2026-01-24 00:50:06 +08:00
KnugiHK
2e7953f4ca Add unit test for get_chat_condition 2026-01-24 00:03:21 +08:00
KnugiHK
95a52231be Fix the returning string for empty filter list 2026-01-24 00:03:08 +08:00
Knugi
e0aab06192 Update LICENSE 2026-01-21 16:06:12 +00:00
Knugi
43b00d8b48 Update README.md 2026-01-21 14:28:41 +00:00
KnugiHK
bf230db595 Gracefully handle bytes that can't be decoded from db (#44) 2026-01-20 23:35:05 +08:00
KnugiHK
242e8ee43a Fix regressions introduced in 194ed29 (default template swap)
This commit restores the logic originally introduced in:

* 265afc1
* 8cf1071
* 177b936
2026-01-20 01:42:30 +08:00
lifnej
c32096b26b Show sql errors if DEBUG flag is set. 2026-01-20 00:07:04 +08:00
lifnej
4aa1c26232 Missing newline in vcard info log. 2026-01-20 00:06:38 +08:00
KnugiHK
feca9ae8e0 Fix error on database without jid_map table
I realized the `jid_map` table might be missing after reviewing @lifnej's work in ee7db80. This fix uses the preflight check result for the table before querying it.

I plan to apply this same pattern to other sections where `jid_map` is used.
2026-01-19 22:59:19 +08:00
KnugiHK
92c325294c Add preflight check to see if the jid_map table exists 2026-01-19 22:53:29 +08:00
KnugiHK
7dbd0dbe3c Add preflight check to see if transcription column exists 2026-01-19 22:46:30 +08:00
KnugiHK
035e61c4d7 Fix incremental merge CI 2026-01-19 21:31:23 +08:00
KnugiHK
96d323e0ed Fetch sender_timestamp for future use
WhatsApp doesn't show when a reaction was made, and I don't want to mess with a popup in the HTML yet. Let’s just fetch the data for now. It might come in handy later.

Credit to @tlcameron3 from #79
2026-01-19 21:28:50 +08:00
Knugi
35ad2559d7 Merge pull request #193 from m1ndy/feature/export-reactions
feat: Add support for exporting message reactions
2026-01-19 20:53:18 +08:00
KnugiHK
8058ed8219 Add tqdm progress bar 2026-01-19 20:49:14 +08:00
KnugiHK
908d8f71ca Fix merge conflict error 2026-01-19 20:41:45 +08:00
Knugi
f2b6a39011 Merge branch 'dev' into feature/export-reactions 2026-01-19 20:38:20 +08:00
KnugiHK
4f531ec52a Reverting the __version__ handle
See my comment at https://github.com/KnugiHK/WhatsApp-Chat-Exporter/pull/193/changes
2026-01-19 20:36:18 +08:00
KnugiHK
b69f645ac3 Adopt the same LID mapping in all SQL queries
Because the chat filter needs it
2026-01-19 20:29:56 +08:00
KnugiHK
f8b959e1e1 Implement an on-the-fly fix of dot-ending files (#185) 2026-01-18 23:03:49 +08:00
KnugiHK
9be210f34a Implement voice message transcription for Android (#159) 2026-01-18 21:59:03 +08:00
KnugiHK
ae7ba3da96 action_type 58 is actually shared with unblocking 2026-01-18 21:53:36 +08:00
KnugiHK
00e58ce2c9 Handle group message sender lid mapping (#188) 2026-01-18 21:25:40 +08:00
KnugiHK
4245ecc615 Update android_handler.py 2026-01-17 15:07:16 +08:00
KnugiHK
68dcc6abe0 Improve brute-force offsets with process pool
Refactored the brute-force offset search in `_decrypt_crypt14` to use `ProcessPoolExecutor` for better parallelism and performance. Improved progress reporting and clean shutdown on success or interruption.
2026-01-17 14:43:51 +08:00
KnugiHK
c05e76569b Add more chat types 2026-01-17 13:55:16 +08:00
KnugiHK
a6fe0d93b1 Rename the obj variable to json_obj in telegram_json_format 2026-01-17 13:54:56 +08:00
KnugiHK
2d096eff4d Add tqdm as dependency 2026-01-17 13:45:39 +08:00
KnugiHK
ea9675973c Refactor Message class to accept pre-initialized Timing object
Pass the `Timing` object directly through `timezone_offset` to avoid repeated initialization of the same object within the `Message` class.
2026-01-17 13:42:11 +08:00
KnugiHK
064b923cfa Convert time unit for progress 2026-01-17 13:22:56 +08:00
KnugiHK
cd35ffc185 Remove the prompt after the user enters the password 2026-01-17 13:19:10 +08:00
KnugiHK
05bd26b8ed Decrease the default brute force worker to 4 2026-01-17 13:18:49 +08:00
KnugiHK
d200130335 Refactor to use tqdm for showing progress 2026-01-17 13:18:31 +08:00
KnugiHK
1c7d6f7912 Update README.md 2026-01-14 02:10:05 +08:00
KnugiHK
94960e4a23 Add iphone_backup_decrypt as an optional dependency (#123)
to make managing dependency easier
2026-01-14 02:07:10 +08:00
KnugiHK
79578d867f Handle new LID mapping #188, #144, #168
Implements the latest LID mapping changes. This should fully address #188 and likely resolves #144 (validation required). Note: A successful fix for #144 deprecates the pending workaround in #168. Additionally, resolved a bug where chat filters were not working for newly created chat rooms.
2026-01-13 01:52:58 +08:00
Knugi
32c93159ac Update ci.yml 2026-01-12 15:50:19 +00:00
KnugiHK
6910cc46a4 Update android_handler.py 2026-01-12 22:55:51 +08:00
KnugiHK
9e0457e720 Adjust the reaction to be rendered in the bottom left/right corner
This makes the reaction match WhatsApp's theme.
2026-01-12 22:54:05 +08:00
KnugiHK
e0967a3104 Defer reaction logging until table existence is confirmed
Moved the "Processing reactions..." log entry to occur after the `message_add_on` table check. This prevents the log from appearing on the old WhatsApp schema.
2026-01-12 22:23:16 +08:00
KnugiHK
db50f24dd8 Minor formats 2026-01-12 22:19:59 +08:00
Cosmo
75fcf33fda feat: Add support for exporting message reactions 2026-01-11 07:06:23 -08:00
KnugiHK
0ba81e0863 Implement granular error handling
Added and improved layered Zlib and SQLite header checks to distinguish between authentication failures (wrong key) and data corruption.
2026-01-08 23:59:31 +08:00
KnugiHK
647e406ac0 Implement early key validation via authenticated decryption (#190)
Utilize `decrypt_and_verify` to immediately identify incorrect user-provided keys via GCM tag validation.
2026-01-08 23:57:02 +08:00
KnugiHK
9cedcf1767 Create conftest to move test_nuitka_binary.py to the end of testing
Moves test_nuitka_binary.py to the end and fails if the file is missing.
2026-01-06 23:00:36 +08:00
KnugiHK
93a020f68d Merge branch 'dev' 2026-01-06 21:19:22 +08:00
KnugiHK
401abfb732 Bump version 2026-01-06 21:19:09 +08:00
KnugiHK
3538c81605 Enhance quoted message resolution to include media caption
Modified the `reply_query` to support messages that may not have body text but contain a media caption.
2026-01-06 20:59:51 +08:00
KnugiHK
5a20953a81 Optimize quoted message lookups via global in-memory mapping
This change replaces the inefficient N+1 SQL query pattern with a pre-computed hash map. By fetching `ZSTANZAID` and `ZTEXT` pairs globally before processing, the exporter can resolve quoted message content in O(1) time.

Crucially, this maintains parity with the Android exporter by ensuring that replies to messages outside the current date or chat filters are still correctly rendered, providing full conversational context without the performance penalty of repeated database hits.
2026-01-06 20:51:29 +08:00
KnugiHK
8f29fa0505 Center the version string in the exporter banner 2026-01-06 20:35:02 +08:00
KnugiHK
0a14da9108 Reduce CI platforms 2026-01-05 00:31:47 +08:00
KnugiHK
929534ff80 Add windows 11 arm and macos 15 intel to CI 2026-01-05 00:17:00 +08:00
KnugiHK
87c1555f03 Add windows 11 arm and macos x64 to binary compiling 2026-01-05 00:02:52 +08:00
Knugi
fd325b6b59 Update generate-website.yml 2026-01-04 05:51:15 +00:00
Knugi
17e927ffd6 Update README.md 2026-01-02 05:35:59 +00:00
Knugi
5b488359c8 Update README.md 2026-01-02 05:32:39 +00:00
Knugi
d2186447c6 Update README.md 2026-01-02 05:30:22 +00:00
Knugi
82abf7d874 Add Verifying Build Integrity section 2026-01-02 04:53:52 +00:00
Knugi
5e676f2663 Merge pull request #187 from KnugiHK/alert-autofix-4
Potential fix for code scanning alert no. 4: Workflow does not contain permissions
2026-01-02 12:39:56 +08:00
Knugi
5da2772112 Potential fix for code scanning alert no. 4: Workflow does not contain permissions
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2026-01-02 04:39:37 +00:00
KnugiHK
04a21728a8 Merge branch 'dev' 2026-01-01 15:08:52 +08:00
KnugiHK
412efd66a0 Add --tg as an alias to --telegram 2026-01-01 15:06:21 +08:00
KnugiHK
0ac1612c6c Django license is no longer needed 2026-01-01 14:56:10 +08:00
KnugiHK
8ffeabfca6 Bump version 2026-01-01 14:02:33 +08:00
KnugiHK
d5ad085210 Update pm.png 2025-12-29 01:40:25 +08:00
KnugiHK
baaafe1eca Update README.md 2025-12-29 00:32:07 +08:00
KnugiHK
91f160fc2a Update example image 2025-12-29 00:31:19 +08:00
Knugi
21cae9fe93 Add Python Support Policy 2025-12-28 10:32:34 +00:00
KnugiHK
a70895f959 Drop Python 3.9 2025-12-28 18:28:10 +08:00
Knugi
79d12b9c8b Fix a typo #184 2025-12-28 06:23:33 +00:00
KnugiHK
ff27918705 Update pyproject.toml 2025-12-27 19:01:40 +08:00
KnugiHK
a1c53c3db2 Update test_nuitka_binary.py 2025-12-27 17:32:28 +08:00
KnugiHK
173eb5d02e Python 3.14 is not yet supported for Nuitka 2025-12-27 17:26:24 +08:00
KnugiHK
b39aae365a Update test_nuitka_binary.py 2025-12-27 17:26:03 +08:00
KnugiHK
10691b954a Update test matrix 2025-12-27 17:15:48 +08:00
KnugiHK
60c421a7d0 ok... large image is not free... 2025-12-27 17:05:21 +08:00
KnugiHK
60ddcc08ed Revert "Update ci.yml"
This reverts commit 02b770a6f4.
2025-12-27 17:05:06 +08:00
Knugi
02b770a6f4 Update ci.yml 2025-12-27 09:03:27 +00:00
KnugiHK
5e1bca53d1 Correct macOS binary architecture naming and add x64 build for macos 2025-12-27 16:58:35 +08:00
KnugiHK
968447fef9 Use powershell native function on Windows 2025-12-27 16:55:48 +08:00
KnugiHK
506442392c Add artifact attestation 2025-12-27 16:53:45 +08:00
KnugiHK
1c2d3acf1b Remove vobject from building CICD 2025-12-27 16:49:58 +08:00
KnugiHK
aef568b80b Merge branch 'main' into dev 2025-12-27 16:48:47 +08:00
Knugi
42e583ac7c Merge pull request #175 from tomballgithub/vcard_fix
Fix vcard decoding errors
2025-12-15 23:00:07 +08:00
Knugi
ea60f878be Upgrade CodeQL action versions to v4 2025-12-15 14:53:39 +00:00
KnugiHK
9d2e06f973 Merge branch 'main' of https://github.com/KnugiHK/Whatsapp-Chat-Exporter 2025-12-15 01:12:30 +08:00
KnugiHK
dffce977de Bump version to 0.12.1 2025-12-15 01:12:14 +08:00
KnugiHK
71ca293557 Add main entry point
Added a main entry point in __main__.py to allow running the exporter as a script. Required for standalone binary
2025-12-15 01:12:04 +08:00
Knugi
75720c6d0a Upgrade GitHub Actions to use version 6 2025-12-14 17:08:49 +00:00
KnugiHK
5a80fe189d Add error handling to quoted-printable decoding
Wrapped the decode_quoted_printable function in a try-except block to handle decoding errors gracefully. If decoding fails, a warning is logged and the original value is returned, improving robustness when processing malformed vCard data.
2025-12-14 23:49:10 +08:00
KnugiHK
bb10203b44 Remove vobject dependency from project and workflow
Eliminated the use of the vobject library from the codebase, dependency groups, and GitHub Actions workflow. vobject is no longer a dependency for vCards enrichment.
2025-12-14 23:47:24 +08:00
KnugiHK
ddd0ac3143 Refactor vCard parsing to improve decoding and structure
Replaces regex-based vCard parsing with dedicated functions for parsing lines, handling quoted-printable encoding, and extracting fields. Adds support for CHARSET and ENCODING parameters, improves handling of multi-line and encoded values, and centralizes vCard entry processing for better maintainability and accuracy.
2025-12-14 23:00:48 +08:00
KnugiHK
43658a92c4 Replace print with logger in read_vcards_file
Changed the contact import message from a print statement to a logger.info call for better logging consistency.
2025-12-14 21:57:17 +08:00
KnugiHK
194ed29a6e Switch the default template to the WhatsApp-alike theme
The old telegram theme can still be applied with the `--old-theme` option
2025-12-14 21:40:17 +08:00
Knugi
fa629503f7 Update Nuitka version and build commands in workflow 2025-12-14 09:43:50 +00:00
Knugi
f6442f9d73 Update Nuitka installation in CI workflow
Removed specific version for Nuitka installation.
2025-12-14 09:20:41 +00:00
tomballgithub
02363af637 Updated vcard test to check for failing cases which caused this PR 2025-12-03 22:42:31 -06:00
tomballgithub
8c9c69a536 Print the number of imported vcards 2025-11-29 20:28:51 -06:00
tomballgithub
029700359e Fix vcard decoding errors 2025-11-29 19:34:27 -06:00
KnugiHK
beaf272a63 Ignore unreadable line in vcard #173
This causes multi-line entries in vCards to be ignored.
2025-11-26 22:05:42 +08:00
Knugi
1d5bad92a7 Add new IV and DB entry to utility.py
Reported by @silasjelley
2025-11-07 13:13:14 +00:00
Knugi
09162bf522 Update README with usage notes and Android link
Added note about providing link for Android export instructions.
2025-10-20 05:55:09 +00:00
KnugiHK
da4cea6230 Change how contacts are populated from vCards (fix #167)
Enrichment is now performed before message processing to ensure that all contacts are available, regardless of whether they exist in the ChatCollection.
2025-10-12 23:18:55 +08:00
Knugi
2b8af6a2fc Merge pull request #163 from jensb/fix-162-empty-chat-names
Update vcards_contacts.py to handle enrichment of empty chat names (#162)
2025-08-19 22:35:46 +08:00
jensb
f04205cb49 Update vcards_contacts.py to handle enrichment of empty chat names. Fixes #162. 2025-08-17 23:55:32 +02:00
KnugiHK
177b936b25 Give styling to "End of history" 2025-07-27 16:28:28 +08:00
KnugiHK
101e554413 Refactor 2025-07-27 16:25:47 +08:00
KnugiHK
49851f5874 Fix overflow in reply text 2025-07-27 16:14:54 +08:00
KnugiHK
8cf1071c90 Implement media preview in reply bubble #128 2025-07-27 15:58:36 +08:00
Knugi
25fa1cc530 Merge pull request #157 from glemco/telegram_json
Add support for telegram JSON file format
2025-07-02 18:26:52 +08:00
glemco
deebd6c87e Changes after code review 2025-06-29 10:49:01 +02:00
KnugiHK
f623eddc23 Fix incorrect SQL statement
The incorrect SQL statement prevents retrieval of media information.
2025-06-19 23:13:28 +08:00
KnugiHK
5cd8d953ac Add an option to skip processing replies in iOS
Since processing replies takes time
2025-06-19 22:10:12 +08:00
KnugiHK
265afc1312 Implement (blue) ticks for message status #146 2025-06-19 22:00:26 +08:00
KnugiHK
9d3e65bd92 Fix error when not supplying a value (default) to --size 2025-06-19 21:41:03 +08:00
KnugiHK
5aa12482e0 Fix on disappearing reply feature in iOS #154 2025-06-19 21:22:20 +08:00
KnugiHK
716d4af3f3 Fix incorrect type on comparison of exception 2025-06-19 21:09:00 +08:00
KnugiHK
4742ffd858 Handle a permission error on macOS #158
Although this does not fix the issue, when the error occurs, it will provide more information to users
2025-06-19 00:10:31 +08:00
glemco
5ed260b0b7 Add support for telegram JSON file format
Add the --telegram command line argument that, combined with a JSON
output, generates a Telegram compatible JSON file [1].

The JSON is per-chat, so the --telegram argument implies the
--json-per-chat setting.

I took a few shortcuts:
* Contact and Ids are inferred from the chat id or phone numbers
* All text is marked as plain (e.g. no markup or different types)
* Only personal chats and private groups supported
* Private groups are defined if the chat has a name
* Various ids try to match the ones in WA but may require bulk edits

[1] - https://core.telegram.org/import-export

Fixes: https://github.com/KnugiHK/WhatsApp-Chat-Exporter/issues/152
2025-06-16 13:01:33 +02:00
KnugiHK
99213503c4 Fix on incorrect rejection by the regex of the size_str
String like '1. MB' should be accepted
2025-06-01 12:17:21 +08:00
KnugiHK
f89f53cf2d Fix test cases 2025-06-01 12:15:54 +08:00
KnugiHK
0ecfe6c59a Cast numeric string in readable_to_bytes 2025-06-01 12:15:15 +08:00
KnugiHK
706466f63b Enforce a tighter check on the input of size_str 2025-06-01 11:54:24 +08:00
KnugiHK
24653b8753 Fixed integer input for --size not being cast to int #156 2025-06-01 11:53:45 +08:00
KnugiHK
e408c31415 Fix: it is impossible to have 0.1 byte as byte is the smallest unit 2025-05-17 19:26:18 +08:00
KnugiHK
6a0fca3e9d Add more tests for utility 2025-05-17 19:16:57 +08:00
KnugiHK
bbb558713f Replace sanitize_filename with safe_name 2025-05-17 18:24:30 +08:00
KnugiHK
ea6e72bf0b Bug fix on incorrectly stripping decimal to integer 2025-05-17 17:46:51 +08:00
KnugiHK
d7ded16239 Reimplement the convert_time_unit function to make it more human-readable 2025-05-17 17:35:30 +08:00
KnugiHK
8c2868a60e Fix on missing return in get_status_location 2025-05-17 16:20:11 +08:00
KnugiHK
a53e5a2b3d Update type hint syntax for Python < 3.10 compatibility 2025-05-17 16:18:16 +08:00
KnugiHK
3f88f7fe08 Replacing slugify with a new function 2025-05-17 16:04:31 +08:00
Knugi
7b66fe2ee2 Update LICENSE.django 2025-05-17 05:40:22 +00:00
Knugi
c70143fb4b Create codeql.yml 2025-05-11 10:26:48 +00:00
Knugi
9c9c4d9ad2 Update README.md 2025-05-11 10:21:37 +00:00
KnugiHK
96e483a6b0 Clean up unused code in bplist.py 2025-05-11 18:16:17 +08:00
KnugiHK
587b743522 Fix logging for decrypting whatsapp database 2025-05-11 18:14:41 +08:00
KnugiHK
33149075d3 autopep8 2025-05-11 18:07:51 +08:00
KnugiHK
cc410b8503 Save the environment by reducing CI targets 2025-05-11 18:01:25 +08:00
KnugiHK
e8acf6da32 Fix key access in f-string for older Python 2025-05-11 17:59:20 +08:00
KnugiHK
667c005a67 Make received_ & read_timestamp optional 2025-05-11 17:49:51 +08:00
KnugiHK
bb48cd381b Fix test case where media_base should never be None 2025-05-11 17:49:33 +08:00
KnugiHK
ae6e8ba7e2 Make to_ & from_json functions dynamic
This is to prevent error like #150 in the future
2025-05-11 17:46:00 +08:00
KnugiHK
1eea5fc5c1 Use the new chat importing method from data_model
This commit also fixes #150
2025-05-11 17:29:24 +08:00
KnugiHK
dd795f3282 Adjust banner position 2025-05-11 17:27:23 +08:00
KnugiHK
75c3999567 Update debug log name 2025-05-11 16:56:19 +08:00
KnugiHK
fa41572753 Change print to logger for better logging in the future
This commit also added --debug and --no-banner options, which will enable debug level of logging and suppress the default banner
2025-05-11 16:53:46 +08:00
KnugiHK
0681661660 Update bruteforce_crypt15.py 2025-05-08 00:51:09 +08:00
KnugiHK
907fe4aa91 Update ci.yml 2025-05-07 22:56:52 +08:00
Knugi
4bd3c1d74a Update pull_request_template.md 2025-05-07 14:55:21 +00:00
KnugiHK
80cb868beb Expand all tests to all common systems 2025-05-07 22:49:06 +08:00
KnugiHK
904f44dc12 Update test_nuitka_binary.py 2025-05-07 22:40:28 +08:00
KnugiHK
520f31651c Forgot to install nuitka 2025-05-07 22:31:11 +08:00
KnugiHK
c346199d05 Fix python versions in ci.yml 2025-05-07 22:30:04 +08:00
KnugiHK
3e37bbb021 Create test_nuitka_binary.py 2025-05-07 22:28:48 +08:00
KnugiHK
0bb4f52a26 Add CI 2025-05-07 21:46:19 +08:00
KnugiHK
a3294ead11 Add a basic sanity check for the exporter
The check makes sure all modules can be imported and the exporter can at least run without any arguments provided.
2025-05-07 21:45:45 +08:00
KnugiHK
e2b773eac5 Move all tests to single directory 2025-05-07 21:31:29 +08:00
KnugiHK
170a108109 Bug fix on incorrectly normalized number that starts with 0 2025-05-07 21:31:03 +08:00
Knugi
1348ec89f0 Merge pull request #149 from fschuh/main_test
Support for incremental merges of two export folders
2025-05-07 21:18:33 +08:00
fschuh
db42ad123d Fixed unit tests so they no longer fail on Windows 2025-05-05 15:53:13 -04:00
fschuh
dad7666adb Updated to also use shutil for JSON file copying 2025-05-05 12:32:29 -04:00
Knugi
f7d1332a14 Update pull_request_template.md 2025-05-05 09:19:45 +00:00
KnugiHK
a58dd78be8 PEP8 2025-05-05 17:13:43 +08:00
KnugiHK
3220ed2d3f Update testing data 2025-05-05 17:11:55 +08:00
KnugiHK
4e1d994aa5 Add message_type when importing json 2025-05-05 17:10:57 +08:00
KnugiHK
4ca56b1c5c Bug fix on wrong type of self.message_type 2025-05-05 17:08:35 +08:00
KnugiHK
60790d89e3 Remove args.incremental_merge from device type check 2025-05-05 16:15:51 +08:00
KnugiHK
ed2ec7cb9e Exit if no json is found 2025-05-05 16:14:05 +08:00
KnugiHK
75c2db6d5c Accept both raw timestamp and formatted time string 2025-05-05 16:13:48 +08:00
KnugiHK
352be849a7 Bug fix on messages with timestamp being '0' 2025-05-05 16:13:17 +08:00
KnugiHK
3e3aeae7ad key_id can also be a string 2025-05-05 16:12:57 +08:00
KnugiHK
9d76cf60af Attach media_base from JSON 2025-05-05 16:12:13 +08:00
KnugiHK
eded9a140f Add new attributes to JSON 2025-05-05 16:11:10 +08:00
KnugiHK
5a9944d14b Respects users' choices on the output JSON 2025-05-05 16:09:53 +08:00
KnugiHK
b8652fcb96 Throwaway variable 2025-05-05 15:22:00 +08:00
KnugiHK
ad267a7226 Quote all paths in output messages 2025-05-05 15:20:46 +08:00
KnugiHK
534aea924d Add docs 2025-05-05 15:20:14 +08:00
fschuh
d0fc620ba6 Added print statement with merging media folder names 2025-05-05 00:41:10 -04:00
fschuh
1f9cbc3ad2 Updated .gitignore with some additional dev folders 2025-05-05 00:39:13 -04:00
fschuh
fab9bc7649 Added unit tests 2025-05-05 00:37:01 -04:00
fschuh
8d34300ea5 Merged JSON files are now only updated on disk if the contents have actually changed. 2025-05-04 22:55:42 -04:00
fschuh
fbffc16452 Added call to main() if directly executing __main__.py file 2025-05-04 15:58:53 -04:00
fschuh
2f15360526 Fixed remaining compatibility issues with latest code 2025-05-04 15:58:02 -04:00
Knugi
5291ed0d6f Update generate-website.yml 2025-05-04 08:10:17 +00:00
Knugi
cab54658ee Update generate-website.js 2025-05-04 08:05:22 +00:00
Knugi
96e5823faa Update LICENSE 2025-05-01 12:25:34 +00:00
KnugiHK
d7ba73047a Merge branch 'main' into dev 2025-04-29 22:01:13 +08:00
Knugi
81f072f899 Update generate-website.js 2025-04-29 13:22:05 +00:00
Knugi
2d8960d5e3 Update README.md 2025-04-29 13:20:14 +00:00
Knugi
bacbcda474 Update README.md 2025-04-29 08:55:31 +00:00
Knugi
9cfbb560eb Update generate-website.yml 2025-04-29 08:52:32 +00:00
Knugi
c37e505408 Update generate-website.yml 2025-04-29 08:49:57 +00:00
fschuh
f460f76441 Fixed issue on command line args validation 2025-04-29 01:22:11 -04:00
fschuh
0dda7b7bd9 Updated README.md with incremental merge help description 2025-04-29 01:22:11 -04:00
fschuh
7cf7329124 Updated help description 2025-04-29 01:22:11 -04:00
fschuh
1207b1e0cc Added support for incremental merging 2025-04-29 01:22:11 -04:00
KnugiHK
b3ce22ddbc Add docs.html to gh-page 2025-04-27 16:21:38 +08:00
Knugi
15d6674644 Delete CNAME 2025-04-27 08:16:50 +00:00
Knugi
07b525b0c6 Update README.md 2025-04-27 07:19:21 +00:00
KnugiHK
bd503a0c7f Update pyproject.toml 2025-04-27 15:16:57 +08:00
KnugiHK
dc639d5dac Update pyproject.toml 2025-04-27 14:40:48 +08:00
KnugiHK
ae6a65f98d Update generate-website.js 2025-04-27 14:07:51 +08:00
KnugiHK
578c961932 Add workflow for generating website from readme 2025-04-27 13:00:51 +08:00
40 changed files with 4869 additions and 1987 deletions
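
Note on commit 5a20953a81: the message describes replacing an N+1 query pattern with a pre-computed hash map of `ZSTANZAID` and `ZTEXT` pairs. The sketch below only illustrates that general idea and is not the exporter's actual code. The table name `toy_message`, the helper names `build_quote_map` and `resolve_replies`, and the sample rows are all hypothetical.

```python
import sqlite3


def build_quote_map(conn):
    """Read every (ZSTANZAID, ZTEXT) pair once and index it by stanza ID."""
    rows = conn.execute(
        "SELECT ZSTANZAID, ZTEXT FROM toy_message WHERE ZSTANZAID IS NOT NULL"
    )
    return {stanza_id: text for stanza_id, text in rows}


def resolve_replies(replies, quote_map):
    """Map each replying message ID to its quoted text using dict lookups only."""
    return {msg_id: quote_map.get(quoted_id) for msg_id, quoted_id in replies}


# Toy in-memory database standing in for the real chat store (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE toy_message (ZSTANZAID TEXT, ZTEXT TEXT)")
conn.executemany(
    "INSERT INTO toy_message VALUES (?, ?)",
    [("A1", "Lunch at noon?"), ("B2", "Sure"), ("C3", None)],
)

# One query up front, then no further database hits per reply.
quote_map = build_quote_map(conn)
replies = [(10, "A1"), (11, "ZZ")]  # "ZZ" does not exist in the table
resolved = resolve_replies(replies, quote_map)
print(resolved)
```

Because the map is built from the whole table rather than from the filtered result set, a reply can still resolve to a message that a date or chat filter would otherwise exclude, which is the behaviour the commit message says it preserves.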

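Note on commit 5a80fe189d: the message says quoted-printable decoding is wrapped in a try-except so that a failure logs a warning and returns the original value. A minimal sketch of that pattern, using only the standard library, could look like the following. The function name `decode_quoted_printable_safe`, its `charset` parameter, and the choice of exceptions to catch are assumptions made for illustration, not the project's implementation.

```python
import logging
import quopri

logger = logging.getLogger(__name__)


def decode_quoted_printable_safe(value, charset="utf-8"):
    """Decode a quoted-printable string; return the input unchanged on failure."""
    try:
        raw = quopri.decodestring(value.encode("ascii"))
        return raw.decode(charset)
    except (UnicodeError, LookupError) as exc:
        # UnicodeError covers non-ASCII input and bytes invalid for the charset;
        # LookupError covers an unknown charset name.
        logger.warning("Could not decode %r: %s", value, exc)
        return value


good = decode_quoted_printable_safe("Caf=C3=A9")
bad = decode_quoted_printable_safe("=FF=FE")  # not valid UTF-8 once decoded
print(good, bad)
```

Returning the original value keeps a single malformed vCard field from aborting the whole import, at the cost of leaving that one field undecoded.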
.github/generate-website.js vendored Normal file

@@ -0,0 +1,489 @@
const fs = require('fs-extra');
const marked = require('marked');
const path = require('path');
const markedAlert = require('marked-alert');
fs.ensureDirSync('docs');
fs.ensureDirSync('docs/imgs');
if (fs.existsSync('imgs')) {
fs.copySync('imgs', 'docs/imgs');
}
if (fs.existsSync('.github/docs.html')) {
fs.copySync('.github/docs.html', 'docs/docs.html');
}
const readmeContent = fs.readFileSync('README.md', 'utf8');
const toc = `<div class="table-of-contents">
<h3>Table of Contents</h3>
<ul>
<li><a href="#intro">Introduction</a></li>
<li><a href="#usage">Usage</a></li>
<li><a href="#todo">To Do</a></li>
<li><a href="#legal">Legal Stuff & Disclaimer</a></li>
</ul>
</div>
`
const generateHTML = (content) =>
`<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="title" content="WhatsApp Chat Exporter">
<meta name="description" content="Export your WhatsApp conversations from Android and iOS/iPadOS devices to HTML, JSON, or text formats. Supports encrypted backups (Crypt12, Crypt14, Crypt15) and customizable templates.">
<meta name="keywords" content="WhatsApp, WhatsApp Chat Exporter, WhatsApp export tool, WhatsApp backup decryption, Crypt12, Crypt14, Crypt15, WhatsApp database parser, WhatsApp chat history, HTML export, JSON export, text export, customizable templates, media handling, vCard import, Python tool, open source, MIT license">
<meta name="robots" content="index, follow">
<meta name="author" content="KnugiHK">
<meta name="license" content="MIT">
<meta name="generator" content="Python">
<title>WhatsApp Chat Exporter</title>
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.0/css/all.min.css">
<style>
:root {
--primary-color: #128C7E;
--secondary-color: #25D366;
--dark-color: #075E54;
--light-color: #DCF8C6;
--text-color: #333;
--light-text: #777;
--code-bg: #f6f8fa;
--border-color: #e1e4e8;
}
* {
margin: 0;
padding: 0;
box-sizing: border-box;
}
body {
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, Helvetica, Arial, sans-serif;
line-height: 1.6;
color: var(--text-color);
background-color: #f9f9f9;
}
.container {
max-width: 1200px;
margin: 0 auto;
padding: 0 20px;
}
header {
background-color: var(--primary-color);
color: white;
padding: 60px 0 40px;
text-align: center;
box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
}
header h1 {
font-size: 2.8rem;
margin-bottom: 16px;
}
.badges {
margin: 20px 0;
display: flex;
justify-content: center;
flex-wrap: wrap;
gap: 10px;
}
.badge {
display: inline-block;
margin: 5px;
}
.tagline {
font-size: 1.2rem;
max-width: 800px;
margin: 0 auto;
padding: 0 20px;
}
.main-content {
background: white;
padding: 40px 0;
margin: 0;
}
.inner-content {
padding: 0 30px;
max-width: 900px;
margin: 0 auto;
}
h2 {
color: var(--dark-color);
margin: 30px 0 15px;
padding-bottom: 8px;
border-bottom: 2px solid var(--light-color);
font-size: 1.8rem;
}
h3 {
color: var(--dark-color);
margin: 25px 0 15px;
font-size: 1.4rem;
}
h4 {
color: var(--dark-color);
margin: 20px 0 10px;
font-size: 1.2rem;
}
p, ul, ol {
margin-bottom: 16px;
}
ul, ol {
padding-left: 25px;
}
a {
color: var(--primary-color);
text-decoration: none;
}
a:hover {
text-decoration: underline;
}
.alert {
background-color: #f8f9fa;
border-left: 4px solid #f0ad4e;
padding: 15px;
margin-bottom: 20px;
border-radius: 3px;
}
.alert--tip {
border-color: var(--secondary-color);
background-color: rgba(37, 211, 102, 0.1);
}
.alert--note {
border-color: #0088cc;
background-color: rgba(0, 136, 204, 0.1);
}
.markdown-alert {
background-color: #f8f9fa;
border-left: 4px solid #f0ad4e;
padding: 15px;
margin-bottom: 20px;
border-radius: 3px;
}
.markdown-alert-note {
border-color: #0088cc;
background-color: rgba(0, 136, 204, 0.1);
}
.markdown-alert-tip {
border-color: var(--secondary-color);
background-color: rgba(37, 211, 102, 0.1);
}
.markdown-alert-important {
border-color: #d9534f;
background-color: rgba(217, 83, 79, 0.1);
}
.markdown-alert-warning {
border-color: #f0ad4e;
background-color: rgba(240, 173, 78, 0.1);
}
.markdown-alert-caution {
border-color: #ff9800;
background-color: rgba(255, 152, 0, 0.1);
}
.markdown-alert p {
margin: 0;
}
.markdown-alert-title {
font-weight: 600;
margin-bottom: 8px;
display: flex;
align-items: center;
gap: 8px;
}
pre {
background-color: var(--code-bg);
border-radius: 6px;
padding: 16px;
overflow-x: auto;
margin: 16px 0;
border: 1px solid var(--border-color);
}
code {
font-family: SFMono-Regular, Consolas, Liberation Mono, Menlo, monospace;
font-size: 85%;
background-color: var(--code-bg);
padding: 0.2em 0.4em;
border-radius: 3px;
}
pre code {
padding: 0;
background-color: transparent;
}
.screenshot {
max-width: 100%;
border-radius: 8px;
box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1);
margin: 20px 0;
border: 1px solid var(--border-color);
}
.feature-grid {
display: grid;
grid-template-columns: repeat(auto-fill, minmax(280px, 1fr));
gap: 20px;
margin: 30px 0;
}
.feature-card {
background: white;
border-radius: 8px;
box-shadow: 0 2px 5px rgba(0, 0, 0, 0.1);
padding: 20px;
border: 1px solid var(--border-color);
transition: transform 0.3s ease;
}
.feature-card:hover {
transform: translateY(-5px);
box-shadow: 0 5px 15px rgba(0, 0, 0, 0.1);
}
.feature-icon {
font-size: 2rem;
color: var(--primary-color);
margin-bottom: 15px;
}
.feature-title {
font-weight: 600;
margin-bottom: 10px;
}
footer {
background-color: var(--dark-color);
color: white;
text-align: center;
padding: 30px 0;
margin-top: 50px;
}
.btn {
display: inline-block;
background-color: var(--primary-color);
color: white;
padding: 10px 20px;
border-radius: 4px;
text-decoration: none;
font-weight: 500;
transition: background-color 0.3s ease;
margin: 5px;
}
.btn:hover {
background-color: var(--dark-color);
text-decoration: none;
}
.btn-secondary {
background-color: white;
color: var(--primary-color);
border: 1px solid var(--primary-color);
}
.btn-secondary:hover {
background-color: var(--light-color);
color: var(--dark-color);
}
.action-buttons {
margin: 30px 0;
text-align: center;
}
.table-of-contents {
background-color: #f8f9fa;
border: 1px solid var(--border-color);
border-radius: 6px;
padding: 15px 25px;
margin: 30px 0;
}
.table-of-contents h3 {
margin-top: 0;
margin-bottom: 10px;
}
.table-of-contents ul {
margin-bottom: 0;
}
.help-text {
color: var(--light-text);
font-size: 0.9rem;
}
.device-section {
padding: 15px;
border: 1px solid var(--border-color);
border-radius: 6px;
margin-bottom: 20px;
background-color: #fff;
}
@media (max-width: 768px) {
header {
padding: 40px 0 30px;
}
header h1 {
font-size: 2.2rem;
}
.tagline {
font-size: 1.1rem;
}
.feature-grid {
grid-template-columns: 1fr;
}
}
</style>
</head>
<body>
<header>
<div class="container">
<h1>WhatsApp Chat Exporter</h1>
<div class="badges">
<a href="https://pypi.org/project/whatsapp-chat-exporter/" class="badge"><img src="https://img.shields.io/pypi/v/whatsapp-chat-exporter?label=Latest%20in%20PyPI" alt="Latest in PyPI"></a>
<a href="https://github.com/KnugiHK/WhatsApp-Chat-Exporter/blob/main/LICENSE" class="badge"><img src="https://img.shields.io/pypi/l/whatsapp-chat-exporter?color=427B93" alt="License MIT"></a>
<a href="https://pypi.org/project/Whatsapp-Chat-Exporter/" class="badge"><img src="https://img.shields.io/pypi/pyversions/Whatsapp-Chat-Exporter" alt="Python"></a>
<a href="https://matrix.to/#/#wtsexporter:matrix.org" class="badge"><img src="https://img.shields.io/matrix/wtsexporter:matrix.org.svg?label=Matrix%20Chat%20Room" alt="Matrix Chat Room"></a>
</div>
<p class="tagline">A customizable Android and iPhone Whatsapp database parser that will give you the history of your Whatsapp conversations in HTML and JSON</p>
<div class="action-buttons">
<a href="https://github.com/KnugiHK/WhatsApp-Chat-Exporter" class="btn"><i class="fab fa-github"></i> GitHub</a>
<a href="https://pypi.org/project/whatsapp-chat-exporter/" class="btn btn-secondary"><i class="fab fa-python"></i> PyPI</a>
</div>
</div>
</header>
<div class="main-content">
<div class="inner-content">
<section id="features">
<h2>Key Features</h2>
<div class="feature-grid">
<div class="feature-card">
<div class="feature-icon"><i class="fas fa-mobile-alt"></i></div>
<h3 class="feature-title">Cross-Platform</h3>
<p>Support for both Android and iOS/iPadOS WhatsApp databases</p>
</div>
<div class="feature-card">
<div class="feature-icon"><i class="fas fa-lock"></i></div>
<h3 class="feature-title">Backup Decryption</h3>
<p>Support for Crypt12, Crypt14, and Crypt15 (End-to-End) encrypted backups</p>
</div>
<div class="feature-card">
<div class="feature-icon"><i class="fas fa-file-export"></i></div>
<h3 class="feature-title">Multiple Formats</h3>
<p>Export your chats in HTML, JSON, and text formats</p>
</div>
<div class="feature-card">
<div class="feature-icon"><i class="fas fa-paint-brush"></i></div>
<h3 class="feature-title">Customizable</h3>
<p>Use custom HTML templates and styling for your chat exports</p>
</div>
<div class="feature-card">
<div class="feature-icon"><i class="fas fa-images"></i></div>
<h3 class="feature-title">Media Support</h3>
<p>Properly handles and organizes your media files in the exports</p>
</div>
<div class="feature-card">
<div class="feature-icon"><i class="fas fa-filter"></i></div>
<h3 class="feature-title">Filtering Options</h3>
<p>Filter chats by date, phone number, and more</p>
</div>
</div>
</section>
<div class="readme-content">
${content}
</div>
<div class="action-buttons">
<a href="https://github.com/KnugiHK/WhatsApp-Chat-Exporter" class="btn"><i class="fab fa-github"></i> View on GitHub</a>
<a href="https://pypi.org/project/whatsapp-chat-exporter/" class="btn btn-secondary"><i class="fab fa-python"></i> PyPI Package</a>
</div>
</div>
</div>
<footer>
<div class="container">
<p>© 2021-${new Date().getFullYear()} WhatsApp Chat Exporter</p>
<p>Licensed under MIT License</p>
<p>
<a href="https://github.com/KnugiHK/WhatsApp-Chat-Exporter" style="color: white; margin: 0 10px;"><i class="fab fa-github fa-lg"></i></a>
<a href="https://matrix.to/#/#wtsexporter:matrix.org" style="color: white; margin: 0 10px;"><i class="fas fa-comments fa-lg"></i></a>
</p>
<p><small>Last updated: ${new Date().toLocaleDateString()}</small></p>
</div>
</footer>
<script>
// Simple script to handle smooth scrolling for anchor links
document.querySelectorAll('a[href^="#"]').forEach(anchor => {
anchor.addEventListener('click', function(e) {
e.preventDefault();
const targetId = this.getAttribute('href');
const targetElement = document.querySelector(targetId);
if (targetElement) {
window.scrollTo({
top: targetElement.offsetTop - 20,
behavior: 'smooth'
});
}
});
});
</script>
</body>
</html>
`;
const processedContent = readmeContent.replace(/\[!\[.*?\]\(.*?\)\]\(.*?\)/g, '')
const htmlContent = marked.use(markedAlert()).parse(processedContent, {
gfm: true,
breaks: true,
renderer: new marked.Renderer()
});
const finalHTML = generateHTML(htmlContent);
fs.writeFileSync('docs/index.html', finalHTML);
console.log('Website generated successfully!');


@@ -1,8 +1,11 @@
# Important Note
**All PRs (except for changes unrelated to source files) should target and start from the `dev` branch.**
## Related Issue
- Please reference the related issue here (e.g., `Fixes #123` or `Closes #456`), if there are any.
- Please put a reference to the related issue here (e.g., `Fixes #123` or `Closes #456`), if there are any.
## Description of Changes
- Briefly describe the changes made in this PR. Explain the purpose, the implementation details, and any important information that reviewers should be aware of.
## Important (Please remove this section before submitting the PR)
- Before submitting this PR, please make sure to look at **[this issue](https://github.com/KnugiHK/WhatsApp-Chat-Exporter/issues/137)**. It contains crucial context and discussion that may affect the changes in this PR.
- Briefly describe the changes made in this PR. Explain the purpose, the implementation details, and any important information that reviewers should be aware of.

.github/workflows/ci.yml

@@ -0,0 +1,50 @@
name: Run Pytest on Dev Branch Push
on:
push:
branches:
- dev
pull_request:
jobs:
ci:
runs-on: ${{ matrix.os }}
permissions:
contents: read
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest]
python-version: ["3.10", "3.11", "3.12", "3.13", "3.14"]
include:
- os: windows-latest
python-version: "3.13"
python_utf8: "1"
- os: macos-latest
python-version: "3.13"
- os: windows-11-arm
python-version: "3.13"
python_utf8: "1"
- os: macos-15-intel
python-version: "3.13"
- os: windows-latest
python-version: "3.14"
python_utf8: "1"
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} on ${{ matrix.os }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install .[all] pytest nuitka
- name: Run pytest
env:
PYTHONUTF8: ${{ matrix.python_utf8 || '0' }}
run: pytest

.github/workflows/codeql.yml

@@ -0,0 +1,100 @@
# For most projects, this workflow file will not need changing; you simply need
# to commit it to your repository.
#
# You may wish to alter this file to override the set of languages analyzed,
# or to provide custom queries or build logic.
#
# ******** NOTE ********
# We have attempted to detect the languages in your repository. Please check
# the `language` matrix defined below to confirm you have the correct set of
# supported CodeQL languages.
#
name: "CodeQL Advanced"
on:
push:
branches: [ "main", "dev" ]
pull_request:
branches: [ "main", "dev" ]
schedule:
- cron: '25 21 * * 5'
jobs:
analyze:
name: Analyze (${{ matrix.language }})
# Runner size impacts CodeQL analysis time. To learn more, please see:
# - https://gh.io/recommended-hardware-resources-for-running-codeql
# - https://gh.io/supported-runners-and-hardware-resources
# - https://gh.io/using-larger-runners (GitHub.com only)
# Consider using larger runners or machines with greater resources for possible analysis time improvements.
runs-on: ${{ (matrix.language == 'swift' && 'macos-latest') || 'ubuntu-latest' }}
permissions:
# required for all workflows
security-events: write
# required to fetch internal or private CodeQL packs
packages: read
# only required for workflows in private repositories
actions: read
contents: read
strategy:
fail-fast: false
matrix:
include:
- language: actions
build-mode: none
- language: python
build-mode: none
# CodeQL supports the following values keywords for 'language': 'actions', 'c-cpp', 'csharp', 'go', 'java-kotlin', 'javascript-typescript', 'python', 'ruby', 'swift'
# Use `c-cpp` to analyze code written in C, C++ or both
# Use 'java-kotlin' to analyze code written in Java, Kotlin or both
# Use 'javascript-typescript' to analyze code written in JavaScript, TypeScript or both
# To learn more about changing the languages that are analyzed or customizing the build mode for your analysis,
# see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/customizing-your-advanced-setup-for-code-scanning.
# If you are analyzing a compiled language, you can modify the 'build-mode' for that language to customize how
# your codebase is analyzed, see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/codeql-code-scanning-for-compiled-languages
steps:
- name: Checkout repository
uses: actions/checkout@v4
# Add any setup steps before running the `github/codeql-action/init` action.
# This includes steps like installing compilers or runtimes (`actions/setup-node`
# or others). This is typically only required for manual builds.
# - name: Setup runtime (example)
# uses: actions/setup-example@v1
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v4
with:
languages: ${{ matrix.language }}
build-mode: ${{ matrix.build-mode }}
# If you wish to specify custom queries, you can do so here or in a config file.
# By default, queries listed here will override any specified in a config file.
# Prefix the list here with "+" to use these queries and those in the config file.
# For more details on CodeQL's query packs, refer to: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs
# queries: security-extended,security-and-quality
# If the analyze step fails for one of the languages you are analyzing with
# "We were unable to automatically build your code", modify the matrix above
# to set the build mode to "manual" for that language. Then modify this step
# to build your code.
# Command-line programs to run using the OS shell.
# 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun
- if: matrix.build-mode == 'manual'
shell: bash
run: |
echo 'If you are using a "manual" build mode for one or more of the' \
'languages you are analyzing, replace this with the commands to build' \
'your code, for example:'
echo ' make bootstrap'
echo ' make release'
exit 1
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v4
with:
category: "/language:${{matrix.language}}"


@@ -7,78 +7,146 @@ on:
permissions:
contents: read
id-token: write
attestations: write
jobs:
linux:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v6
- name: Set up Python
uses: actions/setup-python@v5
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install pycryptodome vobject javaobj-py3 ordered-set zstandard nuitka==2.6.7
pip install pycryptodome javaobj-py3 ordered-set zstandard nuitka==2.8.9
pip install .
- name: Build binary with Nuitka
run: |
python -m nuitka --no-deployment-flag=self-execution --onefile \
python -m nuitka --onefile \
--include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html \
--assume-yes-for-downloads --follow-imports Whatsapp_Chat_Exporter/__main__.py --output-filename=wtsexporter_linux_x64
--assume-yes-for-downloads Whatsapp_Chat_Exporter --output-filename=wtsexporter_linux_x64
sha256sum wtsexporter_linux_x64
- uses: actions/upload-artifact@v4
- name: Generate artifact attestation
uses: actions/attest-build-provenance@v3
with:
name: binary-linux
path: |
./wtsexporter_linux_x64
subject-path: ./wtsexporter_linux_x64
- uses: actions/upload-artifact@v6
with:
name: binary-linux-x64
path: ./wtsexporter_linux_x64
windows:
windows-x64:
runs-on: windows-latest
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v6
- name: Set up Python
uses: actions/setup-python@v5
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install pycryptodome vobject javaobj-py3 ordered-set zstandard nuitka==2.6.7
pip install pycryptodome javaobj-py3 ordered-set zstandard nuitka==2.8.9
pip install .
- name: Build binary with Nuitka
run: |
python -m nuitka --no-deployment-flag=self-execution --onefile --include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html --assume-yes-for-downloads --follow-imports Whatsapp_Chat_Exporter\__main__.py --output-filename=wtsexporter
copy wtsexporter.exe wtsexporter_x64.exe
Get-FileHash wtsexporter_x64.exe
- uses: actions/upload-artifact@v4
python -m nuitka --onefile --include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html --assume-yes-for-downloads Whatsapp_Chat_Exporter --output-filename=wtsexporter
Rename-Item -Path "wtsexporter.exe" -NewName "wtsexporter_win_x64.exe"
Get-FileHash wtsexporter_win_x64.exe
- name: Generate artifact attestation
uses: actions/attest-build-provenance@v3
with:
name: binary-windows
path: |
.\wtsexporter_x64.exe
subject-path: .\wtsexporter_win_x64.exe
- uses: actions/upload-artifact@v6
with:
name: binary-windows-x64
path: .\wtsexporter_win_x64.exe
macos:
windows-arm:
runs-on: windows-11-arm
steps:
- uses: actions/checkout@v6
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install pycryptodome javaobj-py3 ordered-set zstandard nuitka==2.8.9
pip install .
- name: Build binary with Nuitka
run: |
python -m nuitka --onefile --include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html --assume-yes-for-downloads Whatsapp_Chat_Exporter --output-filename=wtsexporter
Rename-Item -Path "wtsexporter.exe" -NewName "wtsexporter_win_arm64.exe"
Get-FileHash wtsexporter_win_arm64.exe
- name: Generate artifact attestation
uses: actions/attest-build-provenance@v3
with:
subject-path: .\wtsexporter_win_arm64.exe
- uses: actions/upload-artifact@v6
with:
name: binary-windows-arm64
path: .\wtsexporter_win_arm64.exe
macos-arm:
runs-on: macos-latest
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v6
- name: Set up Python
uses: actions/setup-python@v5
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install pycryptodome vobject javaobj-py3 ordered-set zstandard nuitka==2.6.7
pip install pycryptodome javaobj-py3 ordered-set zstandard nuitka==2.8.9
pip install .
- name: Build binary with Nuitka
run: |
python -m nuitka --no-deployment-flag=self-execution --onefile \
python -m nuitka --onefile \
--include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html \
--assume-yes-for-downloads --follow-imports Whatsapp_Chat_Exporter/__main__.py --output-filename=wtsexporter_macos_x64
shasum -a 256 wtsexporter_macos_x64
- uses: actions/upload-artifact@v4
--assume-yes-for-downloads Whatsapp_Chat_Exporter --output-filename=wtsexporter
mv wtsexporter wtsexporter_macos_arm64
shasum -a 256 wtsexporter_macos_arm64
- name: Generate artifact attestation
uses: actions/attest-build-provenance@v3
with:
name: binary-macos
path: |
./wtsexporter_macos_x64
subject-path: ./wtsexporter_macos_arm64
- uses: actions/upload-artifact@v6
with:
name: binary-macos-arm64
path: ./wtsexporter_macos_arm64
macos-intel:
runs-on: macos-15-intel
steps:
- uses: actions/checkout@v6
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install pycryptodome javaobj-py3 ordered-set zstandard nuitka==2.8.9
pip install .
- name: Build binary with Nuitka
run: |
python -m nuitka --onefile \
--include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html \
--assume-yes-for-downloads Whatsapp_Chat_Exporter --output-filename=wtsexporter
mv wtsexporter wtsexporter_macos_x64
shasum -a 256 wtsexporter_macos_x64
- name: Generate artifact attestation
uses: actions/attest-build-provenance@v3
with:
subject-path: ./wtsexporter_macos_x64
- uses: actions/upload-artifact@v6
with:
name: binary-macos-x64
path: ./wtsexporter_macos_x64

.github/workflows/generate-website.yml

@@ -0,0 +1,43 @@
name: Generate Website from README
on:
push:
branches:
- main
paths:
- 'README.md'
- '.github/workflows/generate-website.yml'
- '.github/generate-website.js'
- '.github/docs.html'
workflow_dispatch:
permissions:
contents: write
pages: write
jobs:
build-and-deploy:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v6
- name: Set up Node.js
uses: actions/setup-node@v6
with:
node-version: '24'
- name: Install dependencies
run: npm install marked fs-extra marked-alert
- name: Generate website from README
run: |
node .github/generate-website.js
echo 'wts.knugi.dev' > ./docs/CNAME
- name: Deploy to gh-pages
if: github.ref == 'refs/heads/main' # Ensure deployment only happens from main
uses: peaceiris/actions-gh-pages@4f9cc6602d3f66b9c108549d475ec49e8ef4d45e
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
publish_dir: ./docs
publish_branch: gh-pages

.gitignore

@@ -138,7 +138,9 @@ __main__
# Dev time intermediates & temp files
result/
output/
WhatsApp/
AppDomainGroup-group.net.whatsapp.WhatsApp.shared/
/*.db
/*.db-*
/myout

CNAME

@@ -1 +0,0 @@
wts.knugi.dev


@@ -1,6 +1,6 @@
MIT License
Copyright (c) 2021-2023 Knugi
Copyright (c) 2021-2026 Knugi
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal


@@ -1,36 +0,0 @@
The Whatsapp Chat Exporter is licensed under the MIT license. For more information,
refer to the file LICENSE.
Whatsapp Chat Exporter incorporates code from Django, governed by the three-clause
BSD license—a permissive open-source license. The copyright and license details are
provided below to adhere to Django's terms.
------
Copyright (c) Django Software Foundation and individual contributors.
All rights reserved.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. Neither the name of Django nor the names of its contributors may be used
to endorse or promote products derived from this software without
specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

README.md

@@ -3,7 +3,7 @@
[![License MIT](https://img.shields.io/pypi/l/whatsapp-chat-exporter?color=427B93)](https://github.com/KnugiHK/WhatsApp-Chat-Exporter/blob/main/LICENSE)
[![Python](https://img.shields.io/pypi/pyversions/Whatsapp-Chat-Exporter)](https://pypi.org/project/Whatsapp-Chat-Exporter/)
[![Matrix Chat Room](https://img.shields.io/matrix/wtsexporter:matrix.org.svg?label=Matrix%20Chat%20Room)](https://matrix.to/#/#wtsexporter:matrix.org)
![Since 2021](https://img.shields.io/github/created-at/knugihk/WhatsApp-Chat-Exporter?label=Since&color=purple)
[![Since 2021](https://img.shields.io/github/created-at/knugihk/WhatsApp-Chat-Exporter?label=Since&color=purple)](https://wts.knugi.dev)
A customizable Android and iPhone Whatsapp database parser that will give you the history of your Whatsapp conversations in HTML and JSON. Inspired by [Telegram Chat Export Tool](https://telegram.org/blog/export-and-more).
> [!TIP]
@@ -17,6 +17,8 @@ To contribute, see the [Contributing Guidelines](https://github.com/KnugiHK/What
> [!NOTE]
> Usage in README may be removed in the future. Check the usage in [Wiki](https://github.com/KnugiHK/Whatsapp-Chat-Exporter/wiki)
>
> Click [here](https://github.com/KnugiHK/WhatsApp-Chat-Exporter/wiki/Android-Usage#crypt15-end-to-end-encrypted-backup) for the simplest way to export from Android.
First, install the exporter by:
```shell
@@ -50,7 +52,7 @@ wtsexporter -a
The default WhatsApp contact database typically contains contact names extracted from your phone, which the exporter uses to map your chats. However, in some reported cases, the database may never have been populated. In such cases, you can export your contacts to a vCard file from your phone or a cloud provider like Google Contacts. Then, install the necessary dependency and run the following command from the shell:
```sh
pip install whatsapp-chat-exporter["vcards"]
wtsexporter -a --enrich-from-vcard contacts.vcf --default-country-code 852
wtsexporter -a --enrich-from-vcards contacts.vcf --default-country-code 852
```
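To illustrate what `--default-country-code` is for, here is a minimal Python sketch. It assumes that a number starting with `+` already carries a country code and that any other number gets the default prepended. `normalize_number` is a hypothetical helper written for this example, not the exporter's actual implementation, which may treat edge cases differently.

```python
import re


def normalize_number(raw: str, default_country_code: str) -> str:
    """Return digits only, prefixing the default country code when none is present.

    Hypothetical helper for illustration; not part of wtsexporter.
    """
    digits = re.sub(r"\D", "", raw)  # drop spaces, dashes, brackets and '+'
    if raw.strip().startswith("+"):
        return digits  # assumption: a leading '+' means the country code is included
    return default_country_code + digits


print(normalize_number("+852 9123 4567", "852"))  # 85291234567
print(normalize_number("9123 4567", "852"))       # 85291234567
```

Under this assumption, both spellings of the same contact resolve to one number, which is what lets names from the vCard file be matched to chats.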
### Encrypted Android WhatsApp Backup
@@ -99,7 +101,7 @@ wtsexporter -a -k encrypted_backup.key -b msgstore.db.crypt15
```
If you have the 32-byte hex key, simply pass it to the -k option and invoke the command from the shell like this:
```sh
wtsexporter -a -k 432435053b5204b08e5c3823423399aa30ff061435ab89bc4e6713969cdaa5a8 -b msgstore.db.crypt15
wtsexporter -a -k 133735053b5204b08e5c3823423399aa30ff061435ab89bc4e6713969cda1337 -b msgstore.db.crypt15
```
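A 32-byte key written in hex is exactly 64 hexadecimal characters. If you want to sanity-check a key before passing it to `-k`, a minimal Python sketch could look like the following. `parse_hex_key` is a hypothetical helper for this example and is not part of the exporter.

```python
import string


def parse_hex_key(key: str) -> bytes:
    """Convert a 64-character hex string into a 32-byte key, or raise ValueError.

    Hypothetical helper for illustration; not part of wtsexporter.
    """
    key = key.strip()
    if len(key) != 64 or any(c not in string.hexdigits for c in key):
        raise ValueError("expected 64 hexadecimal characters (32 bytes)")
    return bytes.fromhex(key)


key = parse_hex_key("133735053b5204b08e5c3823423399aa30ff061435ab89bc4e6713969cda1337")
print(len(key))  # 32
```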
## Working with iOS/iPadOS (iPhone or iPad)
@@ -134,33 +136,42 @@ wtsexporter -i -b ~/Library/Application\ Support/MobileSync/Backup/[device id]
```
## Results
After extracting, you will get these:
#### Private Message
After extracting, you will get this:
![Private Message](imgs/pm.png)
#### Group Message
![Group Message](imgs/group.png)
## Working with Business
If you are working with WhatsApp Business, add the `--business` flag to the command
```sh
wtsexporter -a --business ...other flags
wtsexporter -i --business ...other flags
```
## More options
Invoking wtsexporter with the --help option shows all available options.
```sh
> wtsexporter --help
usage: wtsexporter [-h] [-a] [-i] [-e EXPORTED] [-w WA] [-m MEDIA] [-b BACKUP] [-d DB] [-k [KEY]]
[--call-db [CALL_DB_IOS]] [--wab WAB] [-o OUTPUT] [-j [JSON]] [--txt [TEXT_FORMAT]] [--no-html]
[--size [SIZE]] [--avoid-encoding-json] [--pretty-print-json [PRETTY_PRINT_JSON]] [--per-chat]
[--import] [-t TEMPLATE] [--offline OFFLINE] [--no-avatar] [--experimental-new-theme]
[--headline HEADLINE] [-c] [--create-separated-media] [--time-offset {-12 to 14}] [--date DATE]
usage: wtsexporter [-h] [--debug] [-a] [-i] [-e EXPORTED] [-w WA] [-m MEDIA] [-b BACKUP] [-d DB]
[-k [KEY]] [--call-db [CALL_DB_IOS]] [--wab WAB] [-o OUTPUT] [-j [JSON]]
[--txt [TEXT_FORMAT]] [--no-html] [--size [SIZE]] [--no-reply] [--avoid-encoding-json]
[--pretty-print-json [PRETTY_PRINT_JSON]] [--tg] [--per-chat] [--import] [-t TEMPLATE]
[--offline OFFLINE] [--no-avatar] [--old-theme] [--headline HEADLINE] [-c]
[--create-separated-media] [--time-offset {-12 to 14}] [--date DATE]
[--date-format FORMAT] [--include [phone number ...]] [--exclude [phone number ...]]
[--dont-filter-empty] [--enrich-from-vcards ENRICH_FROM_VCARDS]
[--default-country-code DEFAULT_COUNTRY_CODE] [-s] [--check-update] [--assume-first-as-me]
[--business] [--decrypt-chunk-size DECRYPT_CHUNK_SIZE]
[--max-bruteforce-worker MAX_BRUTEFORCE_WORKER]
[--default-country-code DEFAULT_COUNTRY_CODE] [--incremental-merge]
[--source-dir SOURCE_DIR] [--target-dir TARGET_DIR] [-s] [--check-update]
[--check-update-pre] [--assume-first-as-me] [--business]
[--decrypt-chunk-size DECRYPT_CHUNK_SIZE]
[--max-bruteforce-worker MAX_BRUTEFORCE_WORKER] [--no-banner] [--fix-dot-files]
A customizable Android and iOS/iPadOS WhatsApp database parser that will give you the history of your WhatsApp
conversations in HTML and JSON. Android Backup Crypt12, Crypt14 and Crypt15 supported.
A customizable Android and iOS/iPadOS WhatsApp database parser that will give you the history of your
WhatsApp conversations in HTML and JSON. Android Backup Crypt12, Crypt14 and Crypt15 supported.
options:
-h, --help show this help message and exit
--debug Enable debug mode
Device Type:
-a, --android Define the target as Android
@@ -172,9 +183,10 @@ Input Files:
-w, --wa WA Path to contact database (default: wa.db/ContactsV2.sqlite)
-m, --media MEDIA Path to WhatsApp media folder (default: WhatsApp)
-b, --backup BACKUP Path to Android (must be used together with -k)/iOS WhatsApp backup
-d, --db DB Path to database file (default: msgstore.db/7c7fba66680ef796b916b067077cc246adacf01d)
-k, --key [KEY] Path to key file. If this option is set for crypt15 backup but nothing is specified, you will
be prompted to enter the key.
-d, --db DB Path to database file (default:
msgstore.db/7c7fba66680ef796b916b067077cc246adacf01d)
-k, --key [KEY] Path to key file. If this option is set for crypt15 backup but nothing is
specified, you will be prompted to enter the key.
--call-db [CALL_DB_IOS]
Path to call database (default: 1b432994e958845fffe8e2f190f26d1511534088) iOS only
--wab, --wa-backup WAB
@@ -183,17 +195,20 @@ Input Files:
Output Options:
-o, --output OUTPUT Output to specific directory (default: result)
-j, --json [JSON] Save the result to a single JSON file (default if present: result.json)
--txt [TEXT_FORMAT] Export chats in text format similar to what WhatsApp officially provided (default if present:
result/)
--txt [TEXT_FORMAT] Export chats in text format similar to what WhatsApp officially provided (default
if present: result/)
--no-html Do not output html files
--size, --output-size, --split [SIZE]
Maximum (rough) size of a single output file in bytes, 0 for auto
--no-reply Do not process replies (iOS only) (default: handle replies)
JSON Options:
--avoid-encoding-json
Don't encode non-ascii characters in the output JSON files
--pretty-print-json [PRETTY_PRINT_JSON]
Pretty print the output JSON.
--tg, --telegram Output the JSON in a format compatible with Telegram export (implies json-per-
chat)
--per-chat Output the JSON file per chat
--import Import JSON file and convert to HTML output
@@ -202,9 +217,9 @@ HTML Options:
Path to custom HTML template
--offline OFFLINE Relative path to offline static files
--no-avatar Do not render avatar in HTML output
--experimental-new-theme
Use the newly designed WhatsApp-alike theme
--headline HEADLINE The custom headline for the HTML output. Use '??' as a placeholder for the chat name
--old-theme Use the old Telegram-alike theme
--headline HEADLINE The custom headline for the HTML output. Use '??' as a placeholder for the chat
name
Media Handling:
-c, --move-media Move the media directory to output directory if the flag is set, otherwise copy it
@@ -220,35 +235,77 @@ Filtering Options:
Include chats that match the supplied phone number
--exclude [phone number ...]
Exclude chats that match the supplied phone number
--dont-filter-empty By default, the exporter will not render chats with no valid message. Setting this flag will
cause the exporter to render those. This is useful if chat(s) are missing from the output
--dont-filter-empty By default, the exporter will not render chats with no valid message. Setting this
flag will cause the exporter to render those. This is useful if chat(s) are
missing from the output
Contact Enrichment:
--enrich-from-vcards ENRICH_FROM_VCARDS
Path to an exported vcf file from Google contacts export. Add names missing from WhatsApp's
default database
Path to an exported vcf file from Google contacts export. Add names missing from
WhatsApp's default database
--default-country-code DEFAULT_COUNTRY_CODE
Use with --enrich-from-vcards. When numbers in the vcf file does not have a country code, this
will be used. 1 is for US, 66 for Thailand etc. Most likely use the number of your own country
Use with --enrich-from-vcards. When numbers in the vcf file does not have a
country code, this will be used. 1 is for US, 66 for Thailand etc. Most likely use
the number of your own country
Incremental Merging:
--incremental-merge Performs an incremental merge of two exports. Requires setting both --source-dir
and --target-dir. The chats (JSON files only) and media from the source directory
will be merged into the target directory. No chat messages or media will be
deleted from the target directory; only new chat messages and media will be added
to it. This enables chat messages and media to be deleted from the device to free
up space, while ensuring they are preserved in the exported backups.
--source-dir SOURCE_DIR
Sets the source directory. Used for performing incremental merges.
--target-dir TARGET_DIR
Sets the target directory. Used for performing incremental merges.
Miscellaneous:
-s, --showkey Show the HEX key used to decrypt the database
--check-update Check for updates (require Internet access)
--check-update-pre Check for updates including pre-releases (require Internet access)
--assume-first-as-me Assume the first message in a chat as sent by me (must be used together with -e)
--business Use Whatsapp Business default files (iOS only)
--decrypt-chunk-size DECRYPT_CHUNK_SIZE
Specify the chunk size for decrypting iOS backup, which may affect the decryption speed.
Specify the chunk size for decrypting iOS backup, which may affect the decryption
speed.
--max-bruteforce-worker MAX_BRUTEFORCE_WORKER
Specify the maximum number of worker for bruteforce decryption.
--no-banner Do not show the banner
--fix-dot-files Fix files with a dot at the end of their name (allowing the outputs be stored in
FAT filesystems)
WhatsApp Chat Exporter: 0.12.0 Licensed with MIT. See https://wts.knugi.dev/docs?dest=osl for all open source
licenses.
WhatsApp Chat Exporter: 0.13.0 Licensed with MIT. See https://wts.knugi.dev/docs?dest=osl for all open
source licenses.
```
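The help text for `--incremental-merge` says that nothing is deleted from the target directory and only new messages are added. The Python sketch below illustrates that rule under stated assumptions: each chat is a JSON file with a `messages` object keyed by message ID, and files are matched by name. The layout and the `merge_chat` and `merge_dirs` helpers are hypothetical and do not reproduce the exporter's own `incremental_merge`.

```python
import json
import shutil
from pathlib import Path


def merge_chat(source: dict, target: dict) -> dict:
    """Return the target chat plus any messages that exist only in the source."""
    merged = dict(target)
    messages = dict(target.get("messages", {}))
    for msg_id, msg in source.get("messages", {}).items():
        # Never overwrite or delete what the target already holds.
        messages.setdefault(msg_id, msg)
    merged["messages"] = messages
    return merged


def merge_dirs(source_dir: Path, target_dir: Path) -> None:
    """Merge every JSON chat file in source_dir into target_dir."""
    target_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(source_dir.glob("*.json")):
        dst = target_dir / src.name
        if not dst.exists():
            shutil.copy2(src, dst)  # the chat is new to the target
            continue
        merged = merge_chat(
            json.loads(src.read_text(encoding="utf-8")),
            json.loads(dst.read_text(encoding="utf-8")),
        )
        dst.write_text(json.dumps(merged, ensure_ascii=False), encoding="utf-8")
```

Because the target's copy of a message always wins, running the merge repeatedly with the same source leaves the target unchanged after the first run.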
# To do
See [issues](https://github.com/KnugiHK/Whatsapp-Chat-Exporter/issues).
# Verifying Build Integrity
To ensure that the binaries provided in the releases were built directly from this source code via GitHub Actions and have not been tampered with, this project uses GitHub Artifact Attestations. You can verify the authenticity of any pre-built binary using the GitHub CLI.
> [!NOTE]
> Requires version 0.13.0 or newer. Legacy binaries are unsupported.
### Using Bash (Linux/WSL/macOS)
```bash
for file in wtsexporter*; do gh attestation verify "$file" -R KnugiHK/WhatsApp-Chat-Exporter; done
```
### Using PowerShell (Windows)
```powershell
gci "wtsexporter*" | % { gh attestation verify $_.FullName -R KnugiHK/WhatsApp-Chat-Exporter }
```
# Python Support Policy
This project officially supports all non-EOL (End-of-Life) versions of Python. Once a Python version reaches EOL, it is dropped in the next release. See [Python's EOL Schedule](https://devguide.python.org/versions/).
# Legal Stuff & Disclaimer
This is an MIT-licensed project.
The Telegram Desktop's export is the reference for whatsapp.html in this repo.


@@ -7,39 +7,61 @@ import shutil
import json
import string
import glob
import logging
import importlib.metadata
from Whatsapp_Chat_Exporter import android_crypt, exported_handler, android_handler
from Whatsapp_Chat_Exporter import ios_handler, ios_media_handler
from Whatsapp_Chat_Exporter.data_model import ChatCollection, ChatStore
from Whatsapp_Chat_Exporter.utility import APPLE_TIME, Crypt, check_update, DbType
from Whatsapp_Chat_Exporter.utility import readable_to_bytes, sanitize_filename
from Whatsapp_Chat_Exporter.utility import import_from_json, bytes_to_readable
from Whatsapp_Chat_Exporter.data_model import ChatCollection, ChatStore, Timing
from Whatsapp_Chat_Exporter.utility import APPLE_TIME, CURRENT_TZ_OFFSET, Crypt
from Whatsapp_Chat_Exporter.utility import readable_to_bytes, safe_name, bytes_to_readable
from Whatsapp_Chat_Exporter.utility import import_from_json, incremental_merge, check_update
from Whatsapp_Chat_Exporter.utility import telegram_json_format, convert_time_unit, DbType
from Whatsapp_Chat_Exporter.utility import get_transcription_selection, check_jid_map
from argparse import ArgumentParser, SUPPRESS
from datetime import datetime
from getpass import getpass
from tqdm import tqdm
from sys import exit
from typing import Tuple, Optional, List, Dict, Any, Union
from typing import Optional, List, Dict
from Whatsapp_Chat_Exporter.vcards_contacts import ContactsFromVCards
# Try to import vobject for contacts processing
try:
import vobject
except ModuleNotFoundError:
vcards_deps_installed = False
else:
from Whatsapp_Chat_Exporter.vcards_contacts import ContactsFromVCards
vcards_deps_installed = True
__version__ = importlib.metadata.version("whatsapp_chat_exporter")
WTSEXPORTER_BANNER = f"""========================================================================================================
██╗ ██╗██╗ ██╗ █████╗ ████████╗███████╗ █████╗ ██████╗ ██████╗
██║ ██║██║ ██║██╔══██╗╚══██╔══╝██╔════╝██╔══██╗██╔══██╗██╔══██╗
██║ █╗ ██║███████║███████║ ██║ ███████╗███████║██████╔╝██████╔╝
██║███╗██║██╔══██║██╔══██║ ██║ ╚════██║██╔══██║██╔═══╝ ██╔═══╝
╚███╔███╔╝██║ ██║██║ ██║ ██║ ███████║██║ ██║██║ ██║
╚══╝╚══╝ ╚═╝ ╚═╝╚═╝ ╚═╝ ╚═╝ ╚══════╝╚═╝ ╚═╝╚═╝ ╚═╝
██████╗██╗ ██╗ █████╗ ████████╗ ███████╗██╗ ██╗██████╗ ██████╗ ██████╗ ████████╗███████╗██████╗
██╔════╝██║ ██║██╔══██╗╚══██╔══╝ ██╔════╝╚██╗██╔╝██╔══██╗██╔═══██╗██╔══██╗╚══██╔══╝██╔════╝██╔══██╗
██║ ███████║███████║ ██║ █████╗ ╚███╔╝ ██████╔╝██║ ██║██████╔╝ ██║ █████╗ ██████╔╝
██║ ██╔══██║██╔══██║ ██║ ██╔══╝ ██╔██╗ ██╔═══╝ ██║ ██║██╔══██╗ ██║ ██╔══╝ ██╔══██╗
╚██████╗██║ ██║██║ ██║ ██║ ███████╗██╔╝ ██╗██║ ╚██████╔╝██║ ██║ ██║ ███████╗██║ ██║
╚═════╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚═╝ ╚══════╝╚═╝ ╚═╝╚═╝ ╚═════╝ ╚═╝ ╚═╝ ╚═╝ ╚══════╝╚═╝ ╚═╝
WhatsApp Chat Exporter: A customizable Android and iOS/iPadOS WhatsApp database parser
{f"Version: {__version__}".center(104)}
========================================================================================================"""
def setup_argument_parser() -> ArgumentParser:
"""Set up and return the argument parser with all options."""
parser = ArgumentParser(
description='A customizable Android and iOS/iPadOS WhatsApp database parser that '
'will give you the history of your WhatsApp conversations in HTML '
'and JSON. Android Backup Crypt12, Crypt14 and Crypt15 supported.',
epilog=f'WhatsApp Chat Exporter: {importlib.metadata.version("whatsapp_chat_exporter")} Licensed with MIT. See '
'https://wts.knugi.dev/docs?dest=osl for all open source licenses.'
'will give you the history of your WhatsApp conversations in HTML '
'and JSON. Android Backup Crypt12, Crypt14 and Crypt15 supported.',
epilog=f'WhatsApp Chat Exporter: {__version__} Licensed with MIT. See '
'https://wts.knugi.dev/docs?dest=osl for all open source licenses.'
)
# General options
parser.add_argument(
"--debug", dest="debug", default=False, action='store_true',
help="Enable debug mode"
)
# Device type arguments
device_group = parser.add_argument_group('Device Type')
device_group.add_argument(
@@ -54,7 +76,7 @@ def setup_argument_parser() -> ArgumentParser:
"-e", "--exported", dest="exported", default=None,
help="Define the target as exported chat file and specify the path to the file"
)
# Input file paths
input_group = parser.add_argument_group('Input Files')
input_group.add_argument(
@@ -86,7 +108,7 @@ def setup_argument_parser() -> ArgumentParser:
"--wab", "--wa-backup", dest="wab", default=None,
help="Path to contact database in crypt15 format"
)
# Output options
output_group = parser.add_argument_group('Output Options')
output_group.add_argument(
@@ -106,10 +128,14 @@ def setup_argument_parser() -> ArgumentParser:
help="Do not output html files"
)
output_group.add_argument(
"--size", "--output-size", "--split", dest="size", nargs='?', const=0, default=None,
"--size", "--output-size", "--split", dest="size", nargs='?', const="0", default=None,
help="Maximum (rough) size of a single output file in bytes, 0 for auto"
)
output_group.add_argument(
"--no-reply", dest="no_reply_ios", default=False, action='store_true',
help="Do not process replies (iOS only) (default: handle replies)"
)
# JSON formatting options
json_group = parser.add_argument_group('JSON Options')
json_group.add_argument(
@@ -120,6 +146,10 @@ def setup_argument_parser() -> ArgumentParser:
'--pretty-print-json', dest='pretty_print_json', default=None, nargs='?', const=2, type=int,
help="Pretty print the output JSON."
)
json_group.add_argument(
"--tg", "--telegram", dest="telegram", default=False, action='store_true',
help="Output the JSON in a format compatible with Telegram export (implies json-per-chat)"
)
json_group.add_argument(
"--per-chat", dest="json_per_chat", default=False, action='store_true',
help="Output the JSON file per chat"
@@ -128,7 +158,7 @@ def setup_argument_parser() -> ArgumentParser:
"--import", dest="import_json", default=False, action='store_true',
help="Import JSON file and convert to HTML output"
)
# HTML options
html_group = parser.add_argument_group('HTML Options')
html_group.add_argument(
@@ -148,14 +178,14 @@ def setup_argument_parser() -> ArgumentParser:
help="Do not render avatar in HTML output"
)
html_group.add_argument(
"--experimental-new-theme", dest="whatsapp_theme", default=False, action='store_true',
help="Use the newly designed WhatsApp-alike theme"
"--old-theme", dest="telegram_theme", default=False, action='store_true',
help="Use the old Telegram-alike theme"
)
html_group.add_argument(
"--headline", dest="headline", default="Chat history with ??",
help="The custom headline for the HTML output. Use '??' as a placeholder for the chat name"
)
# Media handling
media_group = parser.add_argument_group('Media Handling')
media_group.add_argument(
@@ -166,7 +196,7 @@ def setup_argument_parser() -> ArgumentParser:
"--create-separated-media", dest="separate_media", default=False, action='store_true',
help="Create a copy of the media separated per chat in <MEDIA>/separated/ directory"
)
# Filtering options
filter_group = parser.add_argument_group('Filtering Options')
filter_group.add_argument(
@@ -195,7 +225,7 @@ def setup_argument_parser() -> ArgumentParser:
"Setting this flag will cause the exporter to render those. "
"This is useful if chat(s) are missing from the output")
)
# Contact enrichment
contact_group = parser.add_argument_group('Contact Enrichment')
contact_group.add_argument(
@@ -206,7 +236,34 @@ def setup_argument_parser() -> ArgumentParser:
"--default-country-code", dest="default_country_code", default=None,
help="Use with --enrich-from-vcards. When numbers in the vcf file does not have a country code, this will be used. 1 is for US, 66 for Thailand etc. Most likely use the number of your own country"
)
# Incremental merging
inc_merging_group = parser.add_argument_group('Incremental Merging')
inc_merging_group.add_argument(
"--incremental-merge",
dest="incremental_merge",
default=False,
action='store_true',
help=("Performs an incremental merge of two exports. "
"Requires setting both --source-dir and --target-dir. "
"The chats (JSON files only) and media from the source directory will be merged into the target directory. "
"No chat messages or media will be deleted from the target directory; only new chat messages and media will be added to it. "
"This enables chat messages and media to be deleted from the device to free up space, while ensuring they are preserved in the exported backups."
)
)
inc_merging_group.add_argument(
"--source-dir",
dest="source_dir",
default=None,
help="Sets the source directory. Used for performing incremental merges."
)
inc_merging_group.add_argument(
"--target-dir",
dest="target_dir",
default=None,
help="Sets the target directory. Used for performing incremental merges."
)
# Miscellaneous
misc_group = parser.add_argument_group('Miscellaneous')
misc_group.add_argument(
@@ -217,6 +274,10 @@ def setup_argument_parser() -> ArgumentParser:
"--check-update", dest="check_update", default=False, action='store_true',
help="Check for updates (require Internet access)"
)
misc_group.add_argument(
"--check-update-pre", dest="check_update_pre", default=False, action='store_true',
help="Check for updates including pre-releases (require Internet access)"
)
misc_group.add_argument(
"--assume-first-as-me", dest="assume_first_as_me", default=False, action='store_true',
help="Assume the first message in a chat as sent by me (must be used together with -e)"
@@ -230,10 +291,18 @@ def setup_argument_parser() -> ArgumentParser:
help="Specify the chunk size for decrypting iOS backup, which may affect the decryption speed."
)
misc_group.add_argument(
"--max-bruteforce-worker", dest="max_bruteforce_worker", default=10, type=int,
"--max-bruteforce-worker", dest="max_bruteforce_worker", default=4, type=int,
help="Specify the maximum number of worker for bruteforce decryption."
)
misc_group.add_argument(
"--no-banner", dest="no_banner", default=False, action='store_true',
help="Do not show the banner"
)
misc_group.add_argument(
"--fix-dot-files", dest="fix_dot_files", default=False, action='store_true',
help="Fix files with a dot at the end of their name (allowing the outputs be stored in FAT filesystems)"
)
return parser
@@ -245,50 +314,60 @@ def validate_args(parser: ArgumentParser, args) -> None:
if not args.android and not args.ios and not args.exported and not args.import_json:
parser.error("You must define the device type.")
if args.no_html and not args.json and not args.text_format:
parser.error("You must either specify a JSON output file, text file output directory or enable HTML output.")
parser.error(
"You must either specify a JSON output file, text file output directory or enable HTML output.")
if args.import_json and (args.android or args.ios or args.exported or args.no_html):
parser.error("You can only use --import with -j and without --no-html, -a, -i, -e.")
parser.error(
"You can only use --import with -j and without --no-html, -a, -i, -e.")
elif args.import_json and not os.path.isfile(args.json):
parser.error("JSON file not found.")
if args.incremental_merge and (args.source_dir is None or args.target_dir is None):
parser.error(
"You must specify both --source-dir and --target-dir for incremental merge.")
if args.android and args.business:
parser.error("WhatsApp Business is only available on iOS for now.")
if "??" not in args.headline:
parser.error("--headline must contain '??' for replacement.")
# JSON validation
if args.json_per_chat and args.json and (
(args.json.endswith(".json") and os.path.isfile(args.json)) or
(args.json.endswith(".json") and os.path.isfile(args.json)) or
(not args.json.endswith(".json") and os.path.isfile(args.json))
):
parser.error("When --per-chat is enabled, the destination of --json must be a directory.")
parser.error(
"When --per-chat is enabled, the destination of --json must be a directory.")
# vCards validation
if args.enrich_from_vcards is not None and args.default_country_code is None:
parser.error("When --enrich-from-vcards is provided, you must also set --default-country-code")
# Size validation
if args.size is not None and not isinstance(args.size, int) and not args.size.isnumeric():
parser.error(
"When --enrich-from-vcards is provided, you must also set --default-country-code")
# Size validation and conversion
if args.size is not None:
try:
args.size = readable_to_bytes(args.size)
except ValueError:
parser.error("The value for --split must be ended in pure bytes or with a proper unit (e.g., 1048576 or 1MB)")
parser.error(
"The value for --split must be pure bytes or use a proper unit (e.g., 1048576 or 1MB)"
)
# Date filter validation and processing
if args.filter_date is not None:
process_date_filter(parser, args)
# Crypt15 key validation
if args.key is None and args.backup is not None and args.backup.endswith("crypt15"):
args.key = getpass("Enter your encryption key: ")
# Theme validation
if args.whatsapp_theme:
args.template = "whatsapp_new.html"
if args.telegram_theme:
args.template = "whatsapp_old.html"
# Chat filter validation
if args.filter_chat_include is not None and args.filter_chat_exclude is not None:
parser.error("Chat inclusion and exclusion filters cannot be used together.")
parser.error(
"Chat inclusion and exclusion filters cannot be used together.")
validate_chat_filters(parser, args.filter_chat_include)
validate_chat_filters(parser, args.filter_chat_exclude)
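The size validation above passes `args.size` to `readable_to_bytes` and treats `ValueError` as an invalid value, which is presumably why the `const` of `--size` changed from `0` to `"0"`. The implementation of `readable_to_bytes` is not shown in this diff, so the following is only a sketch of what such a conversion might look like. The function name `parse_size` and the unit table are assumptions; binary multiples are used because the error message equates `1MB` with `1048576`.

```python
import re

# Hypothetical unit table (binary multiples, so that "1MB" == 1048576).
UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3, "TB": 1024 ** 4}


def parse_size(text: str) -> int:
    """Convert '1048576' or '1MB' to a byte count; raise ValueError otherwise."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*", text)
    if match is None:
        raise ValueError(f"Invalid size: {text!r}")
    number, unit = match.groups()
    unit = unit.upper() or "B"  # a bare number means bytes
    if unit not in UNITS:
        raise ValueError(f"Unknown unit: {unit!r}")
    return int(float(number) * UNITS[unit])
```

Because the parser only accepts strings, a string default such as `"0"` goes through the same path as user input.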
@@ -298,21 +377,24 @@ def validate_chat_filters(parser: ArgumentParser, chat_filter: Optional[List[str
if chat_filter is not None:
for chat in chat_filter:
if not chat.isnumeric():
parser.error("Enter a phone number in the chat filter. See https://wts.knugi.dev/docs?dest=chat")
parser.error(
"Enter a phone number in the chat filter. See https://wts.knugi.dev/docs?dest=chat")
def process_date_filter(parser: ArgumentParser, args) -> None:
"""Process and validate date filter arguments."""
if " - " in args.filter_date:
start, end = args.filter_date.split(" - ")
start = int(datetime.strptime(start, args.filter_date_format).timestamp())
start = int(datetime.strptime(
start, args.filter_date_format).timestamp())
end = int(datetime.strptime(end, args.filter_date_format).timestamp())
if start < 1009843200 or end < 1009843200:
parser.error("WhatsApp was first released in 2009...")
if start > end:
parser.error("The start date cannot be a moment after the end date.")
parser.error(
"The start date cannot be a moment after the end date.")
if args.android:
args.filter_date = f"BETWEEN {start}000 AND {end}000"
elif args.ios:
@@ -324,13 +406,15 @@ def process_date_filter(parser: ArgumentParser, args) -> None:
def process_single_date_filter(parser: ArgumentParser, args) -> None:
"""Process single date comparison filters."""
if len(args.filter_date) < 3:
parser.error("Unsupported date format. See https://wts.knugi.dev/docs?dest=date")
_timestamp = int(datetime.strptime(args.filter_date[2:], args.filter_date_format).timestamp())
parser.error(
"Unsupported date format. See https://wts.knugi.dev/docs?dest=date")
_timestamp = int(datetime.strptime(
args.filter_date[2:], args.filter_date_format).timestamp())
if _timestamp < 1009843200:
parser.error("WhatsApp was first released in 2009...")
if args.filter_date[:2] == "> ":
if args.android:
args.filter_date = f">= {_timestamp}000"
@@ -342,21 +426,16 @@ def process_single_date_filter(parser: ArgumentParser, args) -> None:
elif args.ios:
args.filter_date = f"<= {_timestamp - APPLE_TIME}"
else:
parser.error("Unsupported date format. See https://wts.knugi.dev/docs?dest=date")
parser.error(
"Unsupported date format. See https://wts.knugi.dev/docs?dest=date")
def setup_contact_store(args) -> Optional['ContactsFromVCards']:
"""Set up and return a contact store if needed."""
if args.enrich_from_vcards is not None:
if not vcards_deps_installed:
print(
"You don't have the dependency to enrich contacts with vCard.\n"
"Read more on how to deal with enriching contacts:\n"
"https://github.com/KnugiHK/Whatsapp-Chat-Exporter/blob/main/README.md#usage"
)
exit(1)
contact_store = ContactsFromVCards()
contact_store.load_vcf_file(args.enrich_from_vcards, args.default_country_code)
contact_store.load_vcf_file(
args.enrich_from_vcards, args.default_country_code)
return contact_store
return None
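The date filter functions earlier in this file turn a parsed date into a SQL comparison: Android timestamps get `000` appended (milliseconds), while iOS timestamps have `APPLE_TIME` subtracted. The sketch below illustrates that conversion for a range. It is not the project's code: `date_range_clause` is a hypothetical helper, the value of `APPLE_TIME` is assumed to be the offset between the Unix epoch and 2001-01-01 UTC, and the iOS `BETWEEN` form is an assumption because that branch is cut off in the diff.

```python
from datetime import datetime, timezone

# Assumed value: seconds between 1970-01-01 and 2001-01-01 (UTC).
APPLE_TIME = 978307200


def date_range_clause(start: int, end: int, platform: str) -> str:
    """Build a SQL BETWEEN clause from two Unix timestamps in seconds."""
    if start > end:
        raise ValueError("The start date cannot be after the end date.")
    if platform == "android":
        # Android stores milliseconds, hence the appended "000".
        return f"BETWEEN {start}000 AND {end}000"
    if platform == "ios":
        # iOS timestamps are counted from the Apple epoch (assumed form).
        return f"BETWEEN {start - APPLE_TIME} AND {end - APPLE_TIME}"
    raise ValueError(f"Unknown platform: {platform!r}")


start = int(datetime(2024, 1, 1, tzinfo=timezone.utc).timestamp())
end = int(datetime(2024, 1, 2, tzinfo=timezone.utc).timestamp())
android_clause = date_range_clause(start, end, "android")
ios_clause = date_range_clause(start, end, "ios")
```

The sketch uses explicit UTC datetimes so the result does not depend on the local timezone, whereas the code in the diff calls `.timestamp()` on naive datetimes.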
@@ -364,11 +443,11 @@ def setup_contact_store(args) -> Optional['ContactsFromVCards']:
def decrypt_android_backup(args) -> int:
"""Decrypt Android backup files and return error code."""
if args.key is None or args.backup is None:
print("You must specify the backup file with -b and a key with -k")
logging.error(f"You must specify the backup file with -b and a key with -k")
return 1
print("Decryption key specified, decrypting WhatsApp backup...")
logging.info(f"Decryption key specified, decrypting WhatsApp backup...")
# Determine crypt type
if "crypt12" in args.backup:
crypt = Crypt.CRYPT12
@@ -377,9 +456,10 @@ def decrypt_android_backup(args) -> int:
elif "crypt15" in args.backup:
crypt = Crypt.CRYPT15
else:
print("Unknown backup format. The backup file must be crypt12, crypt14 or crypt15.")
logging.error(
f"Unknown backup format. The backup file must be crypt12, crypt14 or crypt15.")
return 1
# Get key
keyfile_stream = False
if not os.path.isfile(args.key) and all(char in string.hexdigits for char in args.key.replace(" ", "")):
@@ -387,10 +467,10 @@ def decrypt_android_backup(args) -> int:
else:
key = open(args.key, "rb")
keyfile_stream = True
# Read backup
db = open(args.backup, "rb").read()
# Process WAB if provided
error_wa = 0
if args.wab:
@@ -407,7 +487,7 @@ def decrypt_android_backup(args) -> int:
)
if isinstance(key, io.IOBase):
key.seek(0)
# Decrypt message database
error_message = android_crypt.decrypt_backup(
db,
@@ -419,7 +499,7 @@ def decrypt_android_backup(args) -> int:
keyfile_stream=keyfile_stream,
max_worker=args.max_bruteforce_worker
)
# Handle errors
if error_wa != 0:
return error_wa
@@ -429,25 +509,26 @@ def decrypt_android_backup(args) -> int:
def handle_decrypt_error(error: int) -> None:
"""Handle decryption errors with appropriate messages."""
if error == 1:
print("Dependencies of decrypt_backup and/or extract_encrypted_key"
" are not present. For details, see README.md.")
logging.error("Dependencies of decrypt_backup and/or extract_encrypted_key"
" are not present. For details, see README.md.")
exit(3)
elif error == 2:
print("Failed when decompressing the decrypted backup. "
"Possibly incorrect offsets used in decryption.")
logging.error("Failed when decompressing the decrypted backup. "
"Possibly incorrect offsets used in decryption.")
exit(4)
else:
print("Unknown error occurred.", error)
logging.error("Unknown error occurred.")
exit(5)
def process_contacts(args, data: ChatCollection, contact_store=None) -> None:
def process_contacts(args, data: ChatCollection) -> None:
"""Process contacts from the database."""
contact_db = args.wa if args.wa else "wa.db" if args.android else "ContactsV2.sqlite"
if os.path.isfile(contact_db):
with sqlite3.connect(contact_db) as db:
db.row_factory = sqlite3.Row
db.text_factory = lambda b: b.decode(encoding="utf-8", errors="replace")
if args.android:
android_handler.contacts(db, data, args.enrich_from_vcards)
else:
@@ -457,84 +538,88 @@ def process_contacts(args, data: ChatCollection, contact_store=None) -> None:
def process_messages(args, data: ChatCollection) -> None:
"""Process messages, media and vcards from the database."""
msg_db = args.db if args.db else "msgstore.db" if args.android else args.identifiers.MESSAGE
if not os.path.isfile(msg_db):
print(
logging.error(
"The message database does not exist. You may specify the path "
"to database file with option -d or check your provided path."
)
exit(6)
filter_chat = (args.filter_chat_include, args.filter_chat_exclude)
timing = Timing(args.timezone_offset if args.timezone_offset else CURRENT_TZ_OFFSET)
with sqlite3.connect(msg_db) as db:
db.row_factory = sqlite3.Row
db.text_factory = lambda b: b.decode(encoding="utf-8", errors="replace")
# Process messages
if args.android:
message_handler = android_handler
data.set_system("jid_map_exists", check_jid_map(db))
data.set_system("transcription_selection", get_transcription_selection(db))
else:
message_handler = ios_handler
message_handler.messages(
db, data, args.media, args.timezone_offset,
args.filter_date, filter_chat, args.filter_empty
db, data, args.media, timing, args.filter_date,
filter_chat, args.filter_empty, args.no_reply_ios
)
# Process media
message_handler.media(
db, data, args.media, args.filter_date,
filter_chat, args.filter_empty, args.separate_media
db, data, args.media, args.filter_date,
filter_chat, args.filter_empty, args.separate_media, args.fix_dot_files
)
# Process vcards
message_handler.vcard(
db, data, args.media, args.filter_date,
db, data, args.media, args.filter_date,
filter_chat, args.filter_empty
)
# Process calls
process_calls(args, db, data, filter_chat)
process_calls(args, db, data, filter_chat, timing)
def process_calls(args, db, data: ChatCollection, filter_chat) -> None:
def process_calls(args, db, data: ChatCollection, filter_chat, timing) -> None:
"""Process call history if available."""
if args.android:
android_handler.calls(db, data, args.timezone_offset, filter_chat)
android_handler.calls(db, data, timing, filter_chat)
elif args.ios and args.call_db_ios is not None:
with sqlite3.connect(args.call_db_ios) as cdb:
cdb.row_factory = sqlite3.Row
ios_handler.calls(cdb, data, args.timezone_offset, filter_chat)
cdb.text_factory = lambda b: b.decode(encoding="utf-8", errors="replace")
ios_handler.calls(cdb, data, timing, filter_chat)
def handle_media_directory(args) -> None:
"""Handle media directory copying or moving."""
if os.path.isdir(args.media):
media_path = os.path.join(args.output, args.media)
if os.path.isdir(media_path):
print("\nWhatsApp directory already exists in output directory. Skipping...", end="\n")
logging.info(
f"WhatsApp directory already exists in output directory. Skipping...")
else:
if args.move_media:
try:
print("\nMoving media directory...", end="\n")
logging.info(f"Moving media directory...", extra={"clear": True})
shutil.move(args.media, f"{args.output}/")
logging.info(f"Media directory has been moved to the output directory")
except PermissionError:
print("\nCannot remove original WhatsApp directory. "
"Perhaps the directory is opened?", end="\n")
logging.warning("Cannot remove original WhatsApp directory. "
"Perhaps the directory is opened?")
else:
print("\nCopying media directory...", end="\n")
logging.info(f"Copying media directory...", extra={"clear": True})
shutil.copytree(args.media, media_path)
logging.info(f"Media directory has been copied to the output directory")
def create_output_files(args, data: ChatCollection, contact_store=None) -> None:
def create_output_files(args, data: ChatCollection) -> None:
"""Create output files in the specified formats."""
# Create HTML files if requested
if not args.no_html:
# Enrich from vcards if available
if contact_store and not contact_store.is_empty():
contact_store.enrich_from_vcards(data)
android_handler.create_html(
data,
args.output,
@@ -543,32 +628,29 @@ def create_output_files(args, data: ChatCollection, contact_store=None) -> None:
args.offline,
args.size,
args.no_avatar,
args.whatsapp_theme,
args.telegram_theme,
args.headline
)
# Create text files if requested
if args.text_format:
print("Writing text file...")
logging.info(f"Writing text file...")
android_handler.create_txt(data, args.text_format)
# Create JSON files if requested
if args.json and not args.import_json:
export_json(args, data, contact_store)
export_json(args, data)
def export_json(args, data: ChatCollection, contact_store=None) -> None:
def export_json(args, data: ChatCollection) -> None:
"""Export data to JSON format."""
# Enrich from vcards if available
if contact_store and not contact_store.is_empty():
contact_store.enrich_from_vcards(data)
# TODO: remove all non-target chats from data if filtering is applied?
# Convert ChatStore objects to JSON
if isinstance(data.get(next(iter(data), None)), ChatStore):
data = {jik: chat.to_json() for jik, chat in data.items()}
# Export as a single file or per chat
if not args.json_per_chat:
if not args.json_per_chat and not args.telegram:
export_single_json(args, data)
else:
export_multiple_json(args, data)
@@ -582,42 +664,49 @@ def export_single_json(args, data: Dict) -> None:
ensure_ascii=not args.avoid_encoding_json,
indent=args.pretty_print_json
)
print(f"\nWriting JSON file...({bytes_to_readable(len(json_data))})")
logging.info(f"Writing JSON file...", extra={"clear": True})
f.write(json_data)
logging.info(f"JSON file saved...({bytes_to_readable(len(json_data))})")
def export_multiple_json(args, data: Dict) -> None:
"""Export data to multiple JSON files, one per chat."""
# Adjust output path if needed
json_path = args.json[:-5] if args.json.endswith(".json") else args.json
# Create directory if it doesn't exist
if not os.path.isdir(json_path):
os.makedirs(json_path, exist_ok=True)
# Export each chat
total = len(data.keys())
for index, jik in enumerate(data.keys()):
if data[jik]["name"] is not None:
contact = data[jik]["name"].replace('/', '')
else:
contact = jik.replace('+', '')
with open(f"{json_path}/{sanitize_filename(contact)}.json", "w") as f:
file_content = json.dumps(
{jik: data[jik]},
ensure_ascii=not args.avoid_encoding_json,
indent=args.pretty_print_json
)
f.write(file_content)
print(f"Writing JSON file...({index + 1}/{total})", end="\r")
print()
with tqdm(total=total, desc="Generating JSON files", unit="file", leave=False) as pbar:
for jik in data.keys():
if data[jik]["name"] is not None:
contact = data[jik]["name"].replace('/', '')
else:
contact = jik.replace('+', '')
if args.telegram:
messages = telegram_json_format(jik, data[jik], args.timezone_offset)
else:
messages = {jik: data[jik]}
with open(f"{json_path}/{safe_name(contact)}.json", "w") as f:
file_content = json.dumps(
messages,
ensure_ascii=not args.avoid_encoding_json,
indent=args.pretty_print_json
)
f.write(file_content)
pbar.update(1)
total_time = pbar.format_dict['elapsed']
logging.info(f"Generated {total} JSON files in {convert_time_unit(total_time)}")
def process_exported_chat(args, data: ChatCollection) -> None:
"""Process an exported chat file."""
exported_handler.messages(args.exported, data, args.assume_first_as_me)
if not args.no_html:
android_handler.create_html(
data,
@@ -627,37 +716,87 @@ def process_exported_chat(args, data: ChatCollection) -> None:
args.offline,
args.size,
args.no_avatar,
args.whatsapp_theme,
args.telegram_theme,
args.headline
)
# Copy files to output directory
for file in glob.glob(r'*.*'):
shutil.copy(file, args.output)
class ClearLineFilter(logging.Filter):
def filter(self, record):
is_clear = getattr(record, 'clear', False)
if is_clear:
record.line_end = "\r"
record.prefix = "\x1b[K"
else:
record.line_end = "\n"
record.prefix = ""
return True
def setup_logging(level):
log_handler_stdout = logging.StreamHandler()
log_handler_stdout.terminator = ""
log_handler_stdout.addFilter(ClearLineFilter())
log_handler_stdout.set_name("console")
handlers = [log_handler_stdout]
if level == logging.DEBUG:
timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
log_handler_file = logging.FileHandler(f"wtsexpoter-debug-{timestamp}.log", mode="w")
log_handler_file.terminator = ""
log_handler_file.addFilter(ClearLineFilter())
handlers.append(log_handler_file)
logging.basicConfig(
level=level,
format="[%(levelname)s] %(message)s%(line_end)s",
handlers=handlers
)
def main():
"""Main function to run the WhatsApp Chat Exporter."""
# Set up and parse arguments
parser = setup_argument_parser()
args = parser.parse_args()
# Print banner if not suppressed
if not args.no_banner:
# Note: This may raise UnicodeEncodeError on Windows if the terminal
# doesn't support UTF-8 (e.g., Legacy CMD). Use a modern terminal
# or set PYTHONUTF8=1 in your environment.
print(WTSEXPORTER_BANNER)
if args.debug:
setup_logging(logging.DEBUG)
logging.debug("Debug mode enabled.")
for handler in logging.getLogger().handlers:
if handler.name == "console":
handler.setLevel(logging.INFO)
else:
setup_logging(logging.INFO)
# Check for updates
if args.check_update:
exit(check_update())
if args.check_update or args.check_update_pre:
exit(check_update(args.check_update_pre))
# Validate arguments
validate_args(parser, args)
# Create output directory if it doesn't exist
os.makedirs(args.output, exist_ok=True)
# Initialize data collection
data = ChatCollection()
# Set up contact store for vCard enrichment if needed
contact_store = setup_contact_store(args)
if args.import_json:
# Import from JSON
import_from_json(args.json, data)
@@ -669,7 +808,7 @@ def main():
args.offline,
args.size,
args.no_avatar,
args.whatsapp_theme,
args.telegram_theme,
args.headline
)
elif args.exported:
@@ -681,13 +820,13 @@ def main():
# Set default media path if not provided
if args.media is None:
args.media = "WhatsApp"
# Set default DB paths if not provided
if args.db is None:
args.db = "msgstore.db"
if args.wa is None:
args.wa = "wa.db"
# Decrypt backup if needed
if args.key is not None:
error = decrypt_android_backup(args)
@@ -700,34 +839,54 @@ def main():
else:
from Whatsapp_Chat_Exporter.utility import WhatsAppIdentifier as identifiers
args.identifiers = identifiers
# Set default media path if not provided
if args.media is None:
args.media = identifiers.DOMAIN
# Extract media from backup if needed
if args.backup is not None:
if not os.path.isdir(args.media):
ios_media_handler.extract_media(args.backup, identifiers, args.decrypt_chunk_size)
ios_media_handler.extract_media(
args.backup, identifiers, args.decrypt_chunk_size)
else:
print("WhatsApp directory already exists, skipping WhatsApp file extraction.")
logging.info(
f"WhatsApp directory already exists, skipping WhatsApp file extraction.")
# Set default DB paths if not provided
if args.db is None:
args.db = identifiers.MESSAGE
if args.wa is None:
args.wa = "ContactsV2.sqlite"
# Process contacts
process_contacts(args, data, contact_store)
# Process messages, media, and calls
process_messages(args, data)
# Create output files
create_output_files(args, data, contact_store)
# Handle media directory
handle_media_directory(args)
print("Everything is done!")
if args.incremental_merge:
incremental_merge(
args.source_dir,
args.target_dir,
args.media,
args.pretty_print_json,
args.avoid_encoding_json
)
logging.info(f"Incremental merge completed successfully.")
else:
# Process contacts
process_contacts(args, data)
# Enrich contacts from vCards if needed
if args.android and contact_store and not contact_store.is_empty():
contact_store.enrich_from_vcards(data)
# Process messages, media, and calls
process_messages(args, data)
# Create output files
create_output_files(args, data)
# Handle media directory
handle_media_directory(args)
logging.info("Everything is done!")
if __name__ == "__main__":
main()
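The `ClearLineFilter` and `setup_logging` pair above lets a caller mark a log line as overwritable with `extra={"clear": True}`. The following self-contained sketch shows the mechanism in isolation. It writes to an in-memory buffer instead of the console, the logger name `clear_line_demo` and the helper `build_logger` are hypothetical, and including `%(prefix)s` in the format string is an assumption (the format shown in the diff only uses `%(line_end)s`).

```python
import io
import logging


class ClearLineFilter(logging.Filter):
    """Attach line_end and prefix attributes for the formatter to use."""

    def filter(self, record):
        if getattr(record, "clear", False):
            record.line_end = "\r"    # return to the start of the same line
            record.prefix = "\x1b[K"  # ANSI sequence: erase to end of line
        else:
            record.line_end = "\n"
            record.prefix = ""
        return True


def build_logger(stream) -> logging.Logger:
    handler = logging.StreamHandler(stream)
    handler.terminator = ""  # the format string supplies the line ending
    handler.addFilter(ClearLineFilter())
    handler.setFormatter(
        logging.Formatter("%(prefix)s[%(levelname)s] %(message)s%(line_end)s")
    )
    logger = logging.getLogger("clear_line_demo")
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.info("Copying media directory...", extra={"clear": True})
log.info("Media directory has been copied")
```

Handler filters run before formatting, so every record has both attributes by the time the format string is applied.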


@@ -1,10 +1,12 @@
import hmac
import io
import logging
import zlib
import concurrent.futures
from tqdm import tqdm
from typing import Tuple, Union
from hashlib import sha256
from sys import exit
from functools import partial
from Whatsapp_Chat_Exporter.utility import CRYPT14_OFFSETS, Crypt, DbType
try:
@@ -23,6 +25,8 @@ else:
support_crypt15 = True
class DecryptionError(Exception):
"""Base class for decryption-related exceptions."""
pass
@@ -106,15 +110,39 @@ def _decrypt_database(db_ciphertext: bytes, main_key: bytes, iv: bytes) -> bytes
zlib.error: If decompression fails.
ValueError: if the plaintext is not a SQLite database.
"""
FOOTER_SIZE = 32
if len(db_ciphertext) <= FOOTER_SIZE:
raise ValueError("Input data too short to contain a valid GCM tag.")
actual_ciphertext = db_ciphertext[:-FOOTER_SIZE]
tag = db_ciphertext[-FOOTER_SIZE: -FOOTER_SIZE + 16]
cipher = AES.new(main_key, AES.MODE_GCM, iv)
db_compressed = cipher.decrypt(db_ciphertext)
db = zlib.decompress(db_compressed)
if db[0:6].upper() != b"SQLITE":
try:
db_compressed = cipher.decrypt_and_verify(actual_ciphertext, tag)
except ValueError:
# This could be key, IV, or tag is wrong, but likely the key is wrong.
raise ValueError("Decryption/Authentication failed. Ensure you are using the correct key.")
if len(db_compressed) < 2 or db_compressed[0] != 0x78:
logging.debug(f"Data passes GCM but is not Zlib. Header: {db_compressed[:2].hex()}")
raise ValueError(
"The plaintext is not a SQLite database. Ensure you are using the correct key."
"Key is correct, but decrypted data is not a valid compressed stream. "
"Is this even a valid WhatsApp database backup?"
)
try:
db = zlib.decompress(db_compressed)
except zlib.error as e:
raise zlib.error(f"Decompression failed (The backup file likely corrupted at source): {e}")
if not db.startswith(b"SQLite"):
raise ValueError(
"Data is valid and decompressed, but it is not a SQLite database. "
"Is this even a valid WhatsApp database backup?")
return db
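The new `_decrypt_database` separates three failure modes: a footer that is too short, a payload that is not a zlib stream, and decompressed data that is not a SQLite file. The sketch below reproduces only the parts that need no cryptography library, so the AES-GCM step is left out. `split_footer` and `unpack_database` are hypothetical names; the 32-byte footer with the tag in its first 16 bytes follows the slicing shown in the diff.

```python
import zlib
from typing import Tuple

FOOTER_SIZE = 32


def split_footer(blob: bytes) -> Tuple[bytes, bytes]:
    """Split a blob into (ciphertext, 16-byte tag taken from the footer)."""
    if len(blob) <= FOOTER_SIZE:
        raise ValueError("Input data too short to contain a valid GCM tag.")
    return blob[:-FOOTER_SIZE], blob[-FOOTER_SIZE:-FOOTER_SIZE + 16]


def unpack_database(compressed: bytes) -> bytes:
    """Check the zlib header, decompress, and confirm the SQLite magic."""
    if len(compressed) < 2 or compressed[0] != 0x78:
        raise ValueError("Decrypted data is not a valid compressed stream.")
    db = zlib.decompress(compressed)  # raises zlib.error on corrupt input
    if not db.startswith(b"SQLite"):
        raise ValueError("Decompressed data is not a SQLite database.")
    return db


# A stand-in payload: the SQLite header string padded to 100 bytes.
sample = zlib.compress(b"SQLite format 3\x00" + b"\x00" * 84)
restored = unpack_database(sample)
```

Checking the first byte before calling `zlib.decompress` lets the caller distinguish data that was never compressed from a stream that is compressed but damaged.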
def _decrypt_crypt14(database: bytes, main_key: bytes, max_worker: int = 10) -> bytes:
"""Decrypt a crypt14 database using multithreading for brute-force offset detection.
@@ -135,55 +163,68 @@ def _decrypt_crypt14(database: bytes, main_key: bytes, max_worker: int = 10) ->
# Attempt known offsets first
for offsets in CRYPT14_OFFSETS:
iv = database[offsets["iv"]:offsets["iv"] + 16]
db_ciphertext = database[offsets["db"]:]
iv = offsets["iv"]
db = offsets["db"]
try:
return _decrypt_database(db_ciphertext, main_key, iv)
decrypted_db = _attempt_decrypt_task((iv, iv + 16, db), database, main_key)
except (zlib.error, ValueError):
pass # Try next offset
print("Common offsets failed. Initiating brute-force with multithreading...")
# Convert brute force generator into a list for parallel processing
offset_combinations = list(brute_force_offset())
def attempt_decrypt(offset_tuple):
"""Attempt decryption with the given offsets."""
start_iv, end_iv, start_db = offset_tuple
iv = database[start_iv:end_iv]
db_ciphertext = database[start_db:]
try:
db = _decrypt_database(db_ciphertext, main_key, iv)
print(
f"The offsets of your IV and database are {start_iv} and "
f"{start_db}, respectively. To include your offsets in the "
"program, please report it by creating an issue on GitHub: "
"https://github.com/KnugiHK/Whatsapp-Chat-Exporter/discussions/47"
"\nShutting down other threads..."
continue
else:
logging.debug(
f"Decryption successful with known offsets: IV {iv}, DB {db}"
)
return db
except (zlib.error, ValueError):
return None # Decryption failed, move to next
return decrypted_db # Successful decryption
with concurrent.futures.ThreadPoolExecutor(max_worker) as executor:
future_to_offset = {executor.submit(attempt_decrypt, offset): offset for offset in offset_combinations}
try:
for future in concurrent.futures.as_completed(future_to_offset):
result = future.result()
if result is not None:
# Shutdown remaining threads
logging.info(f"Common offsets failed. Will attempt to brute-force")
offset_max = 200
workers = max_worker
check_offset = partial(_attempt_decrypt_task, database=database, main_key=main_key)
all_offsets = list(brute_force_offset(offset_max, offset_max))
executor = concurrent.futures.ProcessPoolExecutor(max_workers=workers)
try:
with tqdm(total=len(all_offsets), desc="Brute-forcing offsets", unit="trial", leave=False) as pbar:
results = executor.map(check_offset, all_offsets, chunksize=8)
found = False
for offset_info, result in zip(all_offsets, results):
pbar.update(1)
if result:
start_iv, _, start_db = offset_info
# Clean shutdown on success
executor.shutdown(wait=False, cancel_futures=True)
return result
found = True
break
if found:
logging.info(
f"The offsets of your IV and database are {start_iv} and {start_db}, respectively."
)
logging.info(
f"To include your offsets in the exporter, please report them in the discussion thread on GitHub:"
)
logging.info(f"https://github.com/KnugiHK/Whatsapp-Chat-Exporter/discussions/47")
return result
except KeyboardInterrupt:
print("\nBrute force interrupted by user (Ctrl+C). Exiting gracefully...")
executor.shutdown(wait=False, cancel_futures=True)
exit(1)
except KeyboardInterrupt:
executor.shutdown(wait=False, cancel_futures=True)
logging.info("")
raise KeyboardInterrupt(
f"Brute force interrupted by user (Ctrl+C). Shutting down gracefully..."
)
finally:
executor.shutdown(wait=False)
raise OffsetNotFoundError("Could not find the correct offsets for decryption.")
def _attempt_decrypt_task(offset_tuple, database, main_key):
"""Attempt decryption with the given offsets."""
start_iv, end_iv, start_db = offset_tuple
iv = database[start_iv:end_iv]
db_ciphertext = database[start_db:]
try:
return _decrypt_database(db_ciphertext, main_key, iv)
except (zlib.error, ValueError):
return None
def _decrypt_crypt12(database: bytes, main_key: bytes) -> bytes:
@@ -287,7 +328,7 @@ def decrypt_backup(
if crypt is not Crypt.CRYPT15 and len(key) != 158:
raise InvalidKeyError("The key file must be 158 bytes")
#signature check, this is check is used in crypt 12 and 14
# signature check, this check is used in crypt 12 and 14
if crypt != Crypt.CRYPT15:
t1 = key[30:62]
@@ -297,7 +338,6 @@ def decrypt_backup(
if t1 != database[3:35] and crypt == Crypt.CRYPT12:
raise ValueError("The signature of key file and backup file mismatch")
if crypt == Crypt.CRYPT15:
if keyfile_stream:
main_key, hex_key = _extract_enc_key(key)
@@ -305,7 +345,7 @@ def decrypt_backup(
main_key, hex_key = _derive_main_enc_key(key)
if show_crypt15:
hex_key_str = ' '.join([hex_key.hex()[c:c+4] for c in range(0, len(hex_key.hex()), 4)])
print(f"The HEX key of the crypt15 backup is: {hex_key_str}")
logging.info(f"The HEX key of the crypt15 backup is: {hex_key_str}")
else:
main_key = key[126:]
@@ -321,7 +361,6 @@ def decrypt_backup(
except (InvalidFileFormatError, OffsetNotFoundError, ValueError) as e:
raise DecryptionError(f"Decryption failed: {e}") from e
if not dry_run:
with open(output, "wb") as f:
f.write(db)
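
Note on the `_decrypt_database` change above: the new code peels a 32-byte footer off the ciphertext, takes the first 16 bytes of that footer as the GCM tag, and then checks the decrypted bytes in stages (zlib header byte, decompression, SQLite magic). The following is a minimal stdlib-only sketch of the footer split and the post-decryption checks. The AES-GCM step itself is left out, and the helper names `split_footer` and `check_plaintext` are hypothetical, not functions of the exporter.

```python
import zlib
from typing import Tuple

FOOTER_SIZE = 32  # same constant as in the diff
TAG_SIZE = 16     # length of the tag slice taken in the diff


def split_footer(db_ciphertext: bytes) -> Tuple[bytes, bytes]:
    """Hypothetical helper: return (ciphertext, tag) sliced the way the diff does."""
    if len(db_ciphertext) <= FOOTER_SIZE:
        raise ValueError("Input data too short to contain a valid GCM tag.")
    body = db_ciphertext[:-FOOTER_SIZE]
    # The tag is the first 16 bytes of the 32-byte footer.
    tag = db_ciphertext[-FOOTER_SIZE:-FOOTER_SIZE + TAG_SIZE]
    return body, tag


def check_plaintext(db_compressed: bytes) -> bytes:
    """Hypothetical helper: staged checks applied after authentication."""
    # Stage 1: a zlib stream with the usual window size starts with 0x78.
    if len(db_compressed) < 2 or db_compressed[0] != 0x78:
        raise ValueError("Decrypted data is not a valid compressed stream.")
    # Stage 2: decompression; zlib.error propagates if the stream is damaged.
    db = zlib.decompress(db_compressed)
    # Stage 3: the result must start with the SQLite magic.
    if not db.startswith(b"SQLite"):
        raise ValueError("Decompressed data is not a SQLite database.")
    return db
```

Separating the checks this way is what lets each failure carry its own message: a tag mismatch points at the key, a bad header or magic points at the file.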

File diff suppressed because it is too large


@@ -24,51 +24,19 @@ import struct
import codecs
from datetime import datetime, timedelta
class BPListWriter(object):
def __init__(self, objects):
self.bplist = ""
self.objects = objects
def binary(self):
'''binary -> string
Generates bplist
'''
self.data = 'bplist00'
# TODO: flatten objects and count max length size
# TODO: write objects and save offsets
# TODO: write offsets
# TODO: write metadata
return self.data
def write(self, filename):
'''
Writes bplist to file
'''
if self.bplist != "":
pass
# TODO: save self.bplist to file
else:
raise Exception('BPlist not yet generated')
class BPListReader(object):
def __init__(self, s):
self.data = s
self.objects = []
self.resolved = {}
def __unpackIntStruct(self, sz, s):
'''__unpackIntStruct(size, string) -> int
Unpacks the integer of given size (1, 2 or 4 bytes) from string
'''
if sz == 1:
if sz == 1:
ot = '!B'
elif sz == 2:
ot = '!H'
@@ -79,17 +47,17 @@ class BPListReader(object):
else:
raise Exception('int unpack size '+str(sz)+' unsupported')
return struct.unpack(ot, s)[0]
def __unpackInt(self, offset):
'''__unpackInt(offset) -> int
Unpacks int field from plist at given offset
'''
return self.__unpackIntMeta(offset)[1]
def __unpackIntMeta(self, offset):
'''__unpackIntMeta(offset) -> (size, int)
Unpacks int field from plist at given offset and returns its size and value
'''
obj_header = self.data[offset]
@@ -99,7 +67,7 @@ class BPListReader(object):
def __resolveIntSize(self, obj_info, offset):
'''__resolveIntSize(obj_info, offset) -> (count, offset)
Calculates count of objref* array entries and returns count and offset to first element
'''
if obj_info == 0x0F:
@@ -112,10 +80,10 @@ class BPListReader(object):
def __unpackFloatStruct(self, sz, s):
'''__unpackFloatStruct(size, string) -> float
Unpacks the float of given size (4 or 8 bytes) from string
'''
if sz == 4:
if sz == 4:
ot = '!f'
elif sz == 8:
ot = '!d'
@@ -125,7 +93,7 @@ class BPListReader(object):
def __unpackFloat(self, offset):
'''__unpackFloat(offset) -> float
Unpacks float field from plist at given offset
'''
obj_header = self.data[offset]
@@ -135,70 +103,79 @@ class BPListReader(object):
def __unpackDate(self, offset):
td = int(struct.unpack(">d", self.data[offset+1:offset+9])[0])
return datetime(year=2001,month=1,day=1) + timedelta(seconds=td)
return datetime(year=2001, month=1, day=1) + timedelta(seconds=td)
def __unpackItem(self, offset):
'''__unpackItem(offset)
Unpacks and returns an item from plist
'''
obj_header = self.data[offset]
obj_type, obj_info = (obj_header & 0xF0), (obj_header & 0x0F)
if obj_type == 0x00:
if obj_info == 0x00: # null 0000 0000
if obj_type == 0x00:
if obj_info == 0x00: # null 0000 0000
return None
elif obj_info == 0x08: # bool 0000 1000 // false
elif obj_info == 0x08: # bool 0000 1000 // false
return False
elif obj_info == 0x09: # bool 0000 1001 // true
elif obj_info == 0x09: # bool 0000 1001 // true
return True
elif obj_info == 0x0F: # fill 0000 1111 // fill byte
raise Exception("0x0F Not Implemented") # this is really pad byte, FIXME
elif obj_info == 0x0F: # fill 0000 1111 // fill byte
raise Exception("0x0F Not Implemented") # this is really pad byte, FIXME
else:
raise Exception('unpack item type '+str(obj_header)+' at '+str(offset)+ 'failed')
elif obj_type == 0x10: # int 0001 nnnn ... // # of bytes is 2^nnnn, big-endian bytes
raise Exception('unpack item type '+str(obj_header)+' at '+str(offset)+' failed')
elif obj_type == 0x10: # int 0001 nnnn ... // # of bytes is 2^nnnn, big-endian bytes
return self.__unpackInt(offset)
elif obj_type == 0x20: # real 0010 nnnn ... // # of bytes is 2^nnnn, big-endian bytes
elif obj_type == 0x20: # real 0010 nnnn ... // # of bytes is 2^nnnn, big-endian bytes
return self.__unpackFloat(offset)
elif obj_type == 0x30: # date 0011 0011 ... // 8 byte float follows, big-endian bytes
elif obj_type == 0x30: # date 0011 0011 ... // 8 byte float follows, big-endian bytes
return self.__unpackDate(offset)
elif obj_type == 0x40: # data 0100 nnnn [int] ... // nnnn is number of bytes unless 1111 then int count follows, followed by bytes
# data 0100 nnnn [int] ... // nnnn is number of bytes unless 1111 then int count follows, followed by bytes
elif obj_type == 0x40:
obj_count, objref = self.__resolveIntSize(obj_info, offset)
return self.data[objref:objref+obj_count] # XXX: we return data as str
elif obj_type == 0x50: # string 0101 nnnn [int] ... // ASCII string, nnnn is # of chars, else 1111 then int count, then bytes
return self.data[objref:objref+obj_count] # XXX: we return data as str
# string 0101 nnnn [int] ... // ASCII string, nnnn is # of chars, else 1111 then int count, then bytes
elif obj_type == 0x50:
obj_count, objref = self.__resolveIntSize(obj_info, offset)
return self.data[objref:objref+obj_count]
elif obj_type == 0x60: # string 0110 nnnn [int] ... // Unicode string, nnnn is # of chars, else 1111 then int count, then big-endian 2-byte uint16_t
# string 0110 nnnn [int] ... // Unicode string, nnnn is # of chars, else 1111 then int count, then big-endian 2-byte uint16_t
elif obj_type == 0x60:
obj_count, objref = self.__resolveIntSize(obj_info, offset)
return self.data[objref:objref+obj_count*2].decode('utf-16be')
elif obj_type == 0x80: # uid 1000 nnnn ... // nnnn+1 is # of bytes
elif obj_type == 0x80: # uid 1000 nnnn ... // nnnn+1 is # of bytes
# FIXME: Accept as a string for now
obj_count, objref = self.__resolveIntSize(obj_info, offset)
return self.data[objref:objref+obj_count]
elif obj_type == 0xA0: # array 1010 nnnn [int] objref* // nnnn is count, unless '1111', then int count follows
# array 1010 nnnn [int] objref* // nnnn is count, unless '1111', then int count follows
elif obj_type == 0xA0:
obj_count, objref = self.__resolveIntSize(obj_info, offset)
arr = []
for i in range(obj_count):
arr.append(self.__unpackIntStruct(self.object_ref_size, self.data[objref+i*self.object_ref_size:objref+i*self.object_ref_size+self.object_ref_size]))
arr.append(self.__unpackIntStruct(
self.object_ref_size, self.data[objref+i*self.object_ref_size:objref+i*self.object_ref_size+self.object_ref_size]))
return arr
elif obj_type == 0xC0: # set 1100 nnnn [int] objref* // nnnn is count, unless '1111', then int count follows
# set 1100 nnnn [int] objref* // nnnn is count, unless '1111', then int count follows
elif obj_type == 0xC0:
# XXX: not serializable via apple implementation
raise Exception("0xC0 Not Implemented") # FIXME: implement
elif obj_type == 0xD0: # dict 1101 nnnn [int] keyref* objref* // nnnn is count, unless '1111', then int count follows
raise Exception("0xC0 Not Implemented") # FIXME: implement
# dict 1101 nnnn [int] keyref* objref* // nnnn is count, unless '1111', then int count follows
elif obj_type == 0xD0:
obj_count, objref = self.__resolveIntSize(obj_info, offset)
keys = []
for i in range(obj_count):
keys.append(self.__unpackIntStruct(self.object_ref_size, self.data[objref+i*self.object_ref_size:objref+i*self.object_ref_size+self.object_ref_size]))
keys.append(self.__unpackIntStruct(
self.object_ref_size, self.data[objref+i*self.object_ref_size:objref+i*self.object_ref_size+self.object_ref_size]))
values = []
objref += obj_count*self.object_ref_size
for i in range(obj_count):
values.append(self.__unpackIntStruct(self.object_ref_size, self.data[objref+i*self.object_ref_size:objref+i*self.object_ref_size+self.object_ref_size]))
values.append(self.__unpackIntStruct(
self.object_ref_size, self.data[objref+i*self.object_ref_size:objref+i*self.object_ref_size+self.object_ref_size]))
dic = {}
for i in range(obj_count):
dic[keys[i]] = values[i]
return dic
else:
raise Exception('don\'t know how to unpack obj type '+hex(obj_type)+' at '+str(offset))
def __resolveObject(self, idx):
try:
return self.resolved[idx]
@@ -212,7 +189,7 @@ class BPListReader(object):
return newArr
if type(obj) == dict:
newDic = {}
for k,v in obj.items():
for k, v in obj.items():
key_resolved = self.__resolveObject(k)
if isinstance(key_resolved, str):
rk = key_resolved
@@ -225,15 +202,16 @@ class BPListReader(object):
else:
self.resolved[idx] = obj
return obj
def parse(self):
# read header
if self.data[:8] != b'bplist00':
raise Exception('Bad magic')
# read trailer
self.offset_size, self.object_ref_size, self.number_of_objects, self.top_object, self.table_offset = struct.unpack('!6xBB4xI4xI4xI', self.data[-32:])
#print "** plist offset_size:",self.offset_size,"objref_size:",self.object_ref_size,"num_objs:",self.number_of_objects,"top:",self.top_object,"table_ofs:",self.table_offset
self.offset_size, self.object_ref_size, self.number_of_objects, self.top_object, self.table_offset = struct.unpack(
'!6xBB4xI4xI4xI', self.data[-32:])
# print "** plist offset_size:",self.offset_size,"objref_size:",self.object_ref_size,"num_objs:",self.number_of_objects,"top:",self.top_object,"table_ofs:",self.table_offset
# read offset table
self.offset_table = self.data[self.table_offset:-32]
@@ -243,50 +221,25 @@ class BPListReader(object):
offset_entry = ot[:self.offset_size]
ot = ot[self.offset_size:]
self.offsets.append(self.__unpackIntStruct(self.offset_size, offset_entry))
#print "** plist offsets:",self.offsets
# print "** plist offsets:",self.offsets
# read object table
self.objects = []
k = 0
for i in self.offsets:
obj = self.__unpackItem(i)
#print "** plist unpacked",k,type(obj),obj,"at",i
# print "** plist unpacked",k,type(obj),obj,"at",i
k += 1
self.objects.append(obj)
# rebuild object tree
#for i in range(len(self.objects)):
# for i in range(len(self.objects)):
# self.__resolveObject(i)
# return root object
return self.__resolveObject(self.top_object)
@classmethod
def plistWithString(cls, s):
parser = cls(s)
return parser.parse()
# helpers for testing
def plist(obj):
from Foundation import NSPropertyListSerialization, NSPropertyListBinaryFormat_v1_0
b = NSPropertyListSerialization.dataWithPropertyList_format_options_error_(obj, NSPropertyListBinaryFormat_v1_0, 0, None)
return str(b.bytes())
def unplist(s):
from Foundation import NSData, NSPropertyListSerialization
d = NSData.dataWithBytes_length_(s, len(s))
return NSPropertyListSerialization.propertyListWithData_options_format_error_(d, 0, None, None)
if __name__ == "__main__":
import os
import sys
import json
file_path = sys.argv[1]
with open(file_path, "rb") as fp:
data = fp.read()
out = BPListReader(data).parse()
with open(file_path + ".json", "w") as fp:
json.dump(out, indent=4)
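
Note on `BPListReader.parse` above: the reader checks the `bplist00` magic, then unpacks the last 32 bytes of the file as the trailer with the format string `'!6xBB4xI4xI4xI'`. The sketch below applies the same format string to a binary plist produced by the standard library's `plistlib`, and mirrors the date conversion from `__unpackDate`. The function names `read_trailer` and `apple_date` are hypothetical and exist only for this illustration.

```python
import plistlib
import struct
from datetime import datetime, timedelta

TRAILER_SIZE = 32  # the reader slices self.data[-32:]


def read_trailer(data: bytes) -> dict:
    """Hypothetical helper: unpack the bplist trailer as parse() does."""
    if data[:8] != b"bplist00":
        raise ValueError("Bad magic")
    # 6 pad bytes, two size bytes, then three 8-byte fields whose
    # upper 4 bytes are skipped (so values above 2**32 - 1 are not handled).
    fields = struct.unpack("!6xBB4xI4xI4xI", data[-TRAILER_SIZE:])
    names = ("offset_size", "object_ref_size", "number_of_objects",
             "top_object", "table_offset")
    return dict(zip(names, fields))


def apple_date(seconds: float) -> datetime:
    """Hypothetical helper: seconds since 2001-01-01, as in __unpackDate."""
    return datetime(year=2001, month=1, day=1) + timedelta(seconds=int(seconds))


if __name__ == "__main__":
    sample = plistlib.dumps({"a": 1}, fmt=plistlib.FMT_BINARY)
    print(read_trailer(sample))
    print(apple_date(86400.0))
```

The offset table sits between `table_offset` and the trailer, so its length should equal `offset_size * number_of_objects`; that relation is a quick sanity check on any parsed trailer.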


@@ -7,6 +7,7 @@ class Timing:
"""
Handles timestamp formatting with timezone support.
"""
def __init__(self, timezone_offset: Optional[int]) -> None:
"""
Initialize Timing object.
@@ -27,7 +28,7 @@ class Timing:
Returns:
Optional[str]: Formatted timestamp string, or None if timestamp is None
"""
if timestamp:
if timestamp is not None:
timestamp = timestamp / 1000 if timestamp > 9999999999 else timestamp
return datetime.fromtimestamp(timestamp, TimeZone(self.timezone_offset)).strftime(format)
return None
@@ -37,6 +38,7 @@ class TimeZone(tzinfo):
"""
Custom timezone class with fixed offset.
"""
def __init__(self, offset: int) -> None:
"""
Initialize TimeZone object.
@@ -64,6 +66,7 @@ class ChatCollection(MutableMapping):
def __init__(self) -> None:
"""Initialize an empty chat collection."""
self._chats: Dict[str, ChatStore] = {}
self._system: Dict[str, Any] = {}
def __getitem__(self, key: str) -> 'ChatStore':
"""Get a chat by its ID. Required for dict-like access."""
@@ -146,11 +149,34 @@ class ChatCollection(MutableMapping):
"""
return {chat_id: chat.to_json() for chat_id, chat in self._chats.items()}
def get_system(self, key: str) -> Any:
"""
Get a system value by its key.
Args:
key (str): The key of the system value to retrieve
Returns:
Any: The system value if found, None otherwise
"""
return self._system.get(key)
def set_system(self, key: str, value: Any) -> None:
"""
Set a system value by its key.
Args:
key (str): The key of the system value to set
value (Any): The value to set
"""
self._system[key] = value
class ChatStore:
"""
Stores chat information and messages.
"""
def __init__(self, type: str, name: Optional[str] = None, media: Optional[str] = None) -> None:
"""
Initialize ChatStore object.
@@ -159,7 +185,7 @@ class ChatStore:
type (str): Device type (IOS or ANDROID)
name (Optional[str]): Chat name
media (Optional[str]): Path to media folder
Raises:
TypeError: If name is not a string or None
"""
@@ -182,7 +208,7 @@ class ChatStore:
self.their_avatar_thumb = None
self.status = None
self.media_base = ""
def __len__(self) -> int:
"""Get number of messages in the chat. Required for dict-like access."""
return len(self._messages)
@@ -192,7 +218,7 @@ class ChatStore:
if not isinstance(message, Message):
raise TypeError("message must be a Message object")
self._messages[id] = message
def get_message(self, id: str) -> 'Message':
"""Get a message from the chat store."""
return self._messages.get(id)
@@ -204,20 +230,30 @@ class ChatStore:
def to_json(self) -> Dict[str, Any]:
"""Convert chat store to JSON-serializable dict."""
return {
'name': self.name,
'type': self.type,
'my_avatar': self.my_avatar,
'their_avatar': self.their_avatar,
'their_avatar_thumb': self.their_avatar_thumb,
'status': self.status,
'messages': {id: msg.to_json() for id, msg in self._messages.items()}
json_dict = {
key: value
for key, value in self.__dict__.items()
if key != '_messages'
}
json_dict['messages'] = {id: msg.to_json() for id, msg in self._messages.items()}
return json_dict
@classmethod
def from_json(cls, data: Dict) -> 'ChatStore':
"""Create a chat store from JSON data."""
chat = cls(data.get("type"), data.get("name"))
for key, value in data.items():
if hasattr(chat, key) and key not in ("messages", "type", "name"):
setattr(chat, key, value)
for id, msg_data in data.get("messages", {}).items():
message = Message.from_json(msg_data)
chat.add_message(id, message)
return chat
def get_last_message(self) -> 'Message':
"""Get the most recent message in the chat."""
return tuple(self._messages.values())[-1]
def items(self):
"""Get message items pairs."""
return self._messages.items()
@@ -230,21 +266,43 @@ class ChatStore:
"""Get all message keys in the chat."""
return self._messages.keys()
def merge_with(self, other: 'ChatStore'):
"""Merge another ChatStore into this one.
Args:
other (ChatStore): The ChatStore to merge with
"""
if not isinstance(other, ChatStore):
raise TypeError("Can only merge with another ChatStore object")
# Update fields if they are not None in the other ChatStore
self.name = other.name or self.name
self.type = other.type or self.type
self.my_avatar = other.my_avatar or self.my_avatar
self.their_avatar = other.their_avatar or self.their_avatar
self.their_avatar_thumb = other.their_avatar_thumb or self.their_avatar_thumb
self.status = other.status or self.status
# Merge messages
self._messages.update(other._messages)
class Message:
"""
Represents a single message in a chat.
"""
def __init__(
self,
*,
from_me: Union[bool, int],
timestamp: int,
time: Union[int, float, str],
key_id: int,
received_timestamp: int,
read_timestamp: int,
timezone_offset: int = 0,
key_id: Union[int, str],
received_timestamp: int = None,
read_timestamp: int = None,
timezone_offset: Optional[Timing] = Timing(0),
message_type: Optional[int] = None
) -> None:
"""
@@ -255,8 +313,8 @@ class Message:
timestamp (int): Message timestamp
time (Union[int, float, str]): Message time
key_id (Union[int, str]): Message unique identifier
received_timestamp (int): When message was received
read_timestamp (int): When message was read
received_timestamp (int, optional): When message was received. Defaults to None
read_timestamp (int, optional): When message was read. Defaults to None
timezone_offset (Optional[Timing], optional): Timing object used to format timestamps. Defaults to Timing(0)
message_type (Optional[int], optional): Type of message. Defaults to None
@@ -265,10 +323,9 @@ class Message:
"""
self.from_me = bool(from_me)
self.timestamp = timestamp / 1000 if timestamp > 9999999999 else timestamp
timing = Timing(timezone_offset)
if isinstance(time, (int, float)):
self.time = timing.format_timestamp(self.timestamp, "%H:%M")
self.time = timezone_offset.format_timestamp(self.timestamp, "%H:%M")
elif isinstance(time, str):
self.time = time
else:
@@ -281,33 +338,51 @@ class Message:
self.sender = None
self.safe = False
self.mime = None
self.message_type = message_type,
self.received_timestamp = timing.format_timestamp(received_timestamp, "%Y/%m/%d %H:%M")
self.read_timestamp = timing.format_timestamp(read_timestamp, "%Y/%m/%d %H:%M")
self.message_type = message_type
if isinstance(received_timestamp, (int, float)):
self.received_timestamp = timezone_offset.format_timestamp(
received_timestamp, "%Y/%m/%d %H:%M")
elif isinstance(received_timestamp, str):
self.received_timestamp = received_timestamp
else:
self.received_timestamp = None
if isinstance(read_timestamp, (int, float)):
self.read_timestamp = timezone_offset.format_timestamp(
read_timestamp, "%Y/%m/%d %H:%M")
elif isinstance(read_timestamp, str):
self.read_timestamp = read_timestamp
else:
self.read_timestamp = None
# Extra attributes
self.reply = None
self.quoted_data = None
self.caption = None
self.thumb = None # Android specific
self.sticker = False
self.reactions = {}
def to_json(self) -> Dict[str, Any]:
"""Convert message to JSON-serializable dict."""
return {
'from_me': self.from_me,
'timestamp': self.timestamp,
'time': self.time,
'media': self.media,
'key_id': self.key_id,
'meta': self.meta,
'data': self.data,
'sender': self.sender,
'safe': self.safe,
'mime': self.mime,
'reply': self.reply,
'quoted_data': self.quoted_data,
'caption': self.caption,
'thumb': self.thumb,
'sticker': self.sticker
}
key: value
for key, value in self.__dict__.items()
}
@classmethod
def from_json(cls, data: Dict) -> 'Message':
message = cls(
from_me=data["from_me"],
timestamp=data["timestamp"],
time=data["time"],
key_id=data["key_id"],
message_type=data.get("message_type"),
received_timestamp=data.get("received_timestamp"),
read_timestamp=data.get("read_timestamp")
)
added = ("from_me", "timestamp", "time", "key_id", "message_type",
"received_timestamp", "read_timestamp")
for key, value in data.items():
if hasattr(message, key) and key not in added:
setattr(message, key, value)
return message
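
Note on the `Timing.format_timestamp` change above: replacing `if timestamp:` with `if timestamp is not None:` matters because a timestamp of `0` is falsy but still a valid point in time (the Unix epoch). The sketch below reproduces the check together with the millisecond normalization from the diff. It swaps in the standard library's `datetime.timezone` for the project's `TimeZone` class, and the free function `format_timestamp` with its `offset_hours` parameter is a hypothetical stand-in for the method.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Union


def format_timestamp(
    timestamp: Optional[Union[int, float]],
    offset_hours: int = 0,
    fmt: str = "%Y/%m/%d %H:%M",
) -> Optional[str]:
    """Hypothetical stand-in for Timing.format_timestamp."""
    # `is not None` keeps 0 (the epoch) as a valid value; a plain
    # truthiness test would have returned None for it.
    if timestamp is not None:
        # Values above 9999999999 are treated as milliseconds, as in the diff.
        timestamp = timestamp / 1000 if timestamp > 9999999999 else timestamp
        tz = timezone(timedelta(hours=offset_hours))  # stdlib fixed offset
        return datetime.fromtimestamp(timestamp, tz).strftime(fmt)
    return None


if __name__ == "__main__":
    print(format_timestamp(0))
    print(format_timestamp(None))
    print(format_timestamp(1700000000000) == format_timestamp(1700000000))
```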


@@ -1,21 +1,25 @@
#!/usr/bin/python3
import os
import logging
from datetime import datetime
from mimetypes import MimeTypes
from tqdm import tqdm
from Whatsapp_Chat_Exporter.data_model import ChatStore, Message
from Whatsapp_Chat_Exporter.utility import Device
from Whatsapp_Chat_Exporter.utility import Device, convert_time_unit
def messages(path, data, assume_first_as_me=False):
"""
Extracts messages from an exported WhatsApp chat file.
Args:
path: Path to the exported chat file
data: Data container object to store the parsed chat
assume_first_as_me: If True, assumes the first message is sent from the user without asking
Returns:
Updated data container with extracted messages
"""
@@ -23,55 +27,54 @@ def messages(path, data, assume_first_as_me=False):
chat = data.add_chat("ExportedChat", ChatStore(Device.EXPORTED))
you = "" # Will store the username of the current user
user_identification_done = False # Flag to track if user identification has been done
# First pass: count total lines for progress reporting
with open(path, "r", encoding="utf8") as file:
total_row_number = sum(1 for _ in file)
# Second pass: process the messages
with open(path, "r", encoding="utf8") as file:
for index, line in enumerate(file):
you, user_identification_done = process_line(
line, index, chat, path, you,
assume_first_as_me, user_identification_done
)
with tqdm(total=total_row_number, desc="Processing messages & media", unit="msg&media", leave=False) as pbar:
for index, line in enumerate(file):
you, user_identification_done = process_line(
line, index, chat, path, you,
assume_first_as_me, user_identification_done
)
pbar.update(1)
total_time = pbar.format_dict['elapsed']
logging.info(f"Processed {total_row_number} messages & media in {convert_time_unit(total_time)}")
# Show progress
if index % 1000 == 0:
print(f"Processing messages & media...({index}/{total_row_number})", end="\r")
print(f"Processing messages & media...({total_row_number}/{total_row_number})")
return data
def process_line(line, index, chat, file_path, you, assume_first_as_me, user_identification_done):
"""
Process a single line from the chat file
Returns:
Tuple of (updated_you_value, updated_user_identification_done_flag)
"""
parts = line.split(" - ", 1)
# Check if this is a new message (has timestamp format)
if len(parts) > 1:
time = parts[0]
you, user_identification_done = process_new_message(
time, parts[1], index, chat, you, file_path,
time, parts[1], index, chat, you, file_path,
assume_first_as_me, user_identification_done
)
else:
# This is a continuation of the previous message
process_message_continuation(line, index, chat)
return you, user_identification_done
def process_new_message(time, content, index, chat, you, file_path,
def process_new_message(time, content, index, chat, you, file_path,
assume_first_as_me, user_identification_done):
"""
Process a line that contains a new message
Returns:
Tuple of (updated_you_value, updated_user_identification_done_flag)
"""
@@ -84,7 +87,7 @@ def process_new_message(time, content, index, chat, you, file_path,
received_timestamp=None,
read_timestamp=None
)
# Check if this is a system message (no name:message format)
if ":" not in content:
msg.data = content
@@ -92,7 +95,7 @@ def process_new_message(time, content, index, chat, you, file_path,
else:
# Process user message
name, message = content.strip().split(":", 1)
# Handle user identification
if you == "":
if chat.name is None:
@@ -109,17 +112,17 @@ def process_new_message(time, content, index, chat, you, file_path,
# If we know the chat name, anyone else must be "you"
if name != chat.name:
you = name
# Set the chat name if needed
if chat.name is None and name != you:
chat.name = name
# Determine if this message is from the current user
msg.from_me = (name == you)
# Process message content
process_message_content(msg, message, file_path)
chat.add_message(index, msg)
return you, user_identification_done
@@ -140,11 +143,11 @@ def process_attached_file(msg, message, file_path):
"""Process an attached file in a message"""
mime = MimeTypes()
msg.media = True
# Extract file path and check if it exists
file_name = message.split("(file attached)")[0].strip()
attached_file_path = os.path.join(os.path.dirname(file_path), file_name)
if os.path.isfile(attached_file_path):
msg.data = attached_file_path
guess = mime.guess_type(attached_file_path)[0]
@@ -161,9 +164,9 @@ def process_message_continuation(line, index, chat):
lookback = index - 1
while lookback not in chat.keys():
lookback -= 1
msg = chat.get_message(lookback)
# Add the continuation line to the message
if msg.media:
msg.caption = line.strip()
@@ -178,4 +181,4 @@ def prompt_for_user_identification(name):
if ans == "y":
return name
elif ans == "n":
return ""
return ""
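
Note on `process_line` above: a line is treated as a new message when splitting on `" - "` once yields two parts (timestamp and content); anything else is appended to the previous message. The sketch below shows that grouping rule on plain strings. It is a simplification under stated assumptions: `group_lines` is a hypothetical function, it stores plain dicts instead of `Message` objects, and it always joins continuation lines with a newline, whereas the real handler treats media captions separately.

```python
from typing import Dict, List


def group_lines(lines: List[str]) -> List[Dict[str, str]]:
    """Hypothetical helper: group exported chat lines into messages."""
    messages: List[Dict[str, str]] = []
    for line in lines:
        # Same test as process_line: split once on " - ".
        parts = line.split(" - ", 1)
        if len(parts) > 1:
            messages.append({"time": parts[0], "content": parts[1].strip()})
        elif messages:
            # Continuation of the previous message.
            messages[-1]["content"] += "\n" + line.strip()
    return messages


if __name__ == "__main__":
    sample = [
        "01/02/2024, 10:00 - Alice: hello\n",
        "second line of the same message\n",
        "01/02/2024, 10:01 - Bob: hi\n",
    ]
    for message in group_lines(sample):
        print(message)
```

One consequence of the split-based test is that a continuation line which itself contains `" - "` would be read as a new message; the sketch shares that behaviour with the rule it illustrates.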


@@ -1,14 +1,18 @@
#!/usr/bin/python3
import os
import logging
import shutil
from glob import glob
from tqdm import tqdm
from pathlib import Path
from mimetypes import MimeTypes
from markupsafe import escape as htmle
from Whatsapp_Chat_Exporter.data_model import ChatStore, Message
from Whatsapp_Chat_Exporter.utility import APPLE_TIME, CURRENT_TZ_OFFSET, get_chat_condition
from Whatsapp_Chat_Exporter.utility import bytes_to_readable, convert_time_unit, slugify, Device
from Whatsapp_Chat_Exporter.utility import APPLE_TIME, get_chat_condition, Device
from Whatsapp_Chat_Exporter.utility import bytes_to_readable, convert_time_unit, safe_name
def contacts(db, data):
@@ -16,26 +20,28 @@ def contacts(db, data):
c = db.cursor()
c.execute("""SELECT count() FROM ZWAADDRESSBOOKCONTACT WHERE ZABOUTTEXT IS NOT NULL""")
total_row_number = c.fetchone()[0]
print(f"Pre-processing contacts...({total_row_number})")
logging.info(f"Pre-processing contacts...({total_row_number})", extra={"clear": True})
c.execute("""SELECT ZWHATSAPPID, ZABOUTTEXT FROM ZWAADDRESSBOOKCONTACT WHERE ZABOUTTEXT IS NOT NULL""")
content = c.fetchone()
while content is not None:
zwhatsapp_id = content["ZWHATSAPPID"]
if not zwhatsapp_id.endswith("@s.whatsapp.net"):
zwhatsapp_id += "@s.whatsapp.net"
current_chat = ChatStore(Device.IOS)
current_chat.status = content["ZABOUTTEXT"]
data.add_chat(zwhatsapp_id, current_chat)
content = c.fetchone()
with tqdm(total=total_row_number, desc="Processing contacts", unit="contact", leave=False) as pbar:
while (content := c.fetchone()) is not None:
zwhatsapp_id = content["ZWHATSAPPID"]
if not zwhatsapp_id.endswith("@s.whatsapp.net"):
zwhatsapp_id += "@s.whatsapp.net"
current_chat = ChatStore(Device.IOS)
current_chat.status = content["ZABOUTTEXT"]
data.add_chat(zwhatsapp_id, current_chat)
pbar.update(1)
total_time = pbar.format_dict['elapsed']
logging.info(f"Pre-processed {total_row_number} contacts in {convert_time_unit(total_time)}")
def process_contact_avatars(current_chat, media_folder, contact_id):
"""Process and assign avatar images for a contact."""
path = f'{media_folder}/Media/Profile/{contact_id.split("@")[0]}'
avatars = glob(f"{path}*")
if 0 < len(avatars) <= 1:
current_chat.their_avatar = avatars[0]
else:
@@ -55,16 +61,18 @@ def get_contact_name(content):
return content["ZPUSHNAME"]
def messages(db, data, media_folder, timezone_offset, filter_date, filter_chat, filter_empty):
def messages(db, data, media_folder, timezone_offset, filter_date, filter_chat, filter_empty, no_reply):
"""Process WhatsApp messages and contacts from the database."""
c = db.cursor()
cursor2 = db.cursor()
# Build the chat filter conditions
chat_filter_include = get_chat_condition(filter_chat[0], True, ["ZWACHATSESSION.ZCONTACTJID", "ZMEMBERJID"], "ZGROUPINFO", "ios")
chat_filter_exclude = get_chat_condition(filter_chat[1], False, ["ZWACHATSESSION.ZCONTACTJID", "ZMEMBERJID"], "ZGROUPINFO", "ios")
chat_filter_include = get_chat_condition(
filter_chat[0], True, ["ZWACHATSESSION.ZCONTACTJID", "ZMEMBERJID"], "ZGROUPINFO", "ios")
chat_filter_exclude = get_chat_condition(
filter_chat[1], False, ["ZWACHATSESSION.ZCONTACTJID", "ZMEMBERJID"], "ZGROUPINFO", "ios")
date_filter = f'AND ZMESSAGEDATE {filter_date}' if filter_date is not None else ''
# Process contacts first
contact_query = f"""
SELECT count()
@@ -85,7 +93,6 @@ def messages(db, data, media_folder, timezone_offset, filter_date, filter_chat,
"""
c.execute(contact_query)
total_row_number = c.fetchone()[0]
print(f"Processing contacts...({total_row_number})")
# Get distinct contacts
contacts_query = f"""
@@ -105,24 +112,26 @@ def messages(db, data, media_folder, timezone_offset, filter_date, filter_chat,
GROUP BY ZCONTACTJID;
"""
c.execute(contacts_query)
# Process each contact
content = c.fetchone()
while content is not None:
contact_name = get_contact_name(content)
contact_id = content["ZCONTACTJID"]
# Add or update chat
if contact_id not in data:
current_chat = data.add_chat(contact_id, ChatStore(Device.IOS, contact_name, media_folder))
else:
current_chat = data.get_chat(contact_id)
current_chat.name = contact_name
current_chat.my_avatar = os.path.join(media_folder, "Media/Profile/Photo.jpg")
# Process avatar images
process_contact_avatars(current_chat, media_folder, contact_id)
content = c.fetchone()
with tqdm(total=total_row_number, desc="Processing contacts", unit="contact", leave=False) as pbar:
while (content := c.fetchone()) is not None:
contact_name = get_contact_name(content)
contact_id = content["ZCONTACTJID"]
# Add or update chat
if contact_id not in data:
current_chat = data.add_chat(contact_id, ChatStore(Device.IOS, contact_name, media_folder))
else:
current_chat = data.get_chat(contact_id)
current_chat.name = contact_name
current_chat.my_avatar = os.path.join(media_folder, "Media/Profile/Photo.jpg")
# Process avatar images
process_contact_avatars(current_chat, media_folder, contact_id)
pbar.update(1)
total_time = pbar.format_dict['elapsed']
logging.info(f"Processed {total_row_number} contacts in {convert_time_unit(total_time)}")
# Get message count
message_count_query = f"""
@@ -139,8 +148,8 @@ def messages(db, data, media_folder, timezone_offset, filter_date, filter_chat,
"""
c.execute(message_count_query)
total_row_number = c.fetchone()[0]
print(f"Processing messages...(0/{total_row_number})", end="\r")
logging.info(f"Processing messages...(0/{total_row_number})", extra={"clear": True})
# Fetch messages
messages_query = f"""
SELECT ZCONTACTJID,
@@ -168,52 +177,58 @@ def messages(db, data, media_folder, timezone_offset, filter_date, filter_chat,
ORDER BY ZMESSAGEDATE ASC;
"""
c.execute(messages_query)
reply_query = """SELECT ZSTANZAID,
ZTEXT,
ZTITLE
FROM ZWAMESSAGE
LEFT JOIN ZWAMEDIAITEM
ON ZWAMESSAGE.Z_PK = ZWAMEDIAITEM.ZMESSAGE
WHERE ZTEXT IS NOT NULL
OR ZTITLE IS NOT NULL;"""
cursor2.execute(reply_query)
message_map = {row[0][:17]: row[1] or row[2] for row in cursor2.fetchall() if row[0]}
# Process each message
i = 0
content = c.fetchone()
while content is not None:
contact_id = content["ZCONTACTJID"]
message_pk = content["Z_PK"]
is_group_message = content["ZGROUPINFO"] is not None
# Ensure chat exists
if contact_id not in data:
current_chat = data.add_chat(contact_id, ChatStore(Device.IOS))
process_contact_avatars(current_chat, media_folder, contact_id)
else:
current_chat = data.get_chat(contact_id)
# Create message object
ts = APPLE_TIME + content["ZMESSAGEDATE"]
message = Message(
from_me=content["ZISFROMME"],
timestamp=ts,
time=ts,
key_id=content["ZSTANZAID"][:17],
timezone_offset=timezone_offset if timezone_offset else CURRENT_TZ_OFFSET,
message_type=content["ZMESSAGETYPE"],
received_timestamp=APPLE_TIME + content["ZSENTDATE"] if content["ZSENTDATE"] else None,
read_timestamp=None # TODO: Add timestamp
)
# Process message data
invalid = process_message_data(message, content, is_group_message, data, cursor2)
# Add valid messages to chat
if not invalid:
current_chat.add_message(message_pk, message)
# Update progress
i += 1
if i % 1000 == 0:
print(f"Processing messages...({i}/{total_row_number})", end="\r")
content = c.fetchone()
print(f"Processing messages...({total_row_number}/{total_row_number})", end="\r")
with tqdm(total=total_row_number, desc="Processing messages", unit="msg", leave=False) as pbar:
while (content := c.fetchone()) is not None:
contact_id = content["ZCONTACTJID"]
message_pk = content["Z_PK"]
is_group_message = content["ZGROUPINFO"] is not None
# Ensure chat exists
if contact_id not in data:
current_chat = data.add_chat(contact_id, ChatStore(Device.IOS))
process_contact_avatars(current_chat, media_folder, contact_id)
else:
current_chat = data.get_chat(contact_id)
# Create message object
ts = APPLE_TIME + content["ZMESSAGEDATE"]
message = Message(
from_me=content["ZISFROMME"],
timestamp=ts,
time=ts,
key_id=content["ZSTANZAID"][:17],
timezone_offset=timezone_offset,
message_type=content["ZMESSAGETYPE"],
received_timestamp=APPLE_TIME + content["ZSENTDATE"] if content["ZSENTDATE"] else None,
read_timestamp=None # TODO: Add timestamp
)
# Process message data
invalid = process_message_data(message, content, is_group_message, data, message_map, no_reply)
# Add valid messages to chat
if not invalid:
current_chat.add_message(message_pk, message)
pbar.update(1)
total_time = pbar.format_dict['elapsed']
logging.info(f"Processed {total_row_number} messages in {convert_time_unit(total_time)}")
def process_message_data(message, content, is_group_message, data, cursor2):
def process_message_data(message, content, is_group_message, data, message_map, no_reply):
"""Process and set message data from content row."""
# Handle group sender info
if is_group_message and content["ZISFROMME"] == 0:
@@ -230,31 +245,24 @@ def process_message_data(message, content, is_group_message, data, cursor2):
message.sender = name or fallback
else:
message.sender = None
# Handle metadata messages
if content["ZMESSAGETYPE"] == 6:
return process_metadata_message(message, content, is_group_message)
# Handle quoted replies
if content["ZMETADATA"] is not None and content["ZMETADATA"].startswith(b"\x2a\x14") and False:
if content["ZMETADATA"] is not None and content["ZMETADATA"].startswith(b"\x2a\x14") and not no_reply:
quoted = content["ZMETADATA"][2:19]
message.reply = quoted.decode()
cursor2.execute(f"""SELECT ZTEXT
FROM ZWAMESSAGE
WHERE ZSTANZAID LIKE '{message.reply}%'""")
quoted_content = cursor2.fetchone()
if quoted_content and "ZTEXT" in quoted_content:
message.quoted_data = quoted_content["ZTEXT"]
else:
message.quoted_data = None
message.quoted_data = message_map.get(message.reply)
# Handle stickers
if content["ZMESSAGETYPE"] == 15:
message.sticker = True
# Process message text
process_message_text(message, content)
return False # Message is valid
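
The change above replaces one `LIKE` query per quoted reply with a dictionary built once before the message loop. A minimal sketch of that idea follows, assuming an in-memory stand-in for `ZWAMESSAGE` with only two columns. The helper name `build_message_map` and the sample stanza IDs are hypothetical, and the `ZWAMEDIAITEM` join used for `ZTITLE` is left out for brevity.

```python
import sqlite3

# Hypothetical stand-in for the real ZWAMESSAGE table (the real schema has many more columns).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE ZWAMESSAGE (ZSTANZAID TEXT, ZTEXT TEXT)")
db.executemany(
    "INSERT INTO ZWAMESSAGE VALUES (?, ?)",
    [
        ("A" * 17 + "XYZ", "Hello"),  # stanza ID longer than the 17-character key
        ("B" * 17, None),             # no text, filtered out by the WHERE clause
        (None, "orphan"),             # no stanza ID, skipped while building the map
        ("C" * 17, "Bye"),
    ],
)


def build_message_map(conn, prefix_len=17):
    """Map the first `prefix_len` characters of each stanza ID to its text."""
    rows = conn.execute(
        "SELECT ZSTANZAID, ZTEXT FROM ZWAMESSAGE WHERE ZTEXT IS NOT NULL"
    )
    return {stanza[:prefix_len]: text for stanza, text in rows if stanza}


message_map = build_message_map(db)

# Looking up a quoted message is now a dictionary access instead of a SQL query.
quoted_data = message_map.get("A" * 17)
missing = message_map.get("B" * 17)
```

Besides saving a query per reply, the lookup no longer interpolates message content into SQL text, which the removed `LIKE '{message.reply}%'` statement did.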
@@ -299,19 +307,21 @@ def process_message_text(message, content):
msg = content["ZTEXT"]
if msg is not None:
msg = msg.replace("\r\n", "<br>").replace("\n", "<br>")
message.data = msg
def media(db, data, media_folder, filter_date, filter_chat, filter_empty, separate_media=False):
def media(db, data, media_folder, filter_date, filter_chat, filter_empty, separate_media=False, fix_dot_files=False):
"""Process media files from WhatsApp messages."""
c = db.cursor()
# Build filter conditions
chat_filter_include = get_chat_condition(filter_chat[0], True, ["ZWACHATSESSION.ZCONTACTJID","ZMEMBERJID"], "ZGROUPINFO", "ios")
chat_filter_exclude = get_chat_condition(filter_chat[1], False, ["ZWACHATSESSION.ZCONTACTJID", "ZMEMBERJID"], "ZGROUPINFO", "ios")
chat_filter_include = get_chat_condition(
filter_chat[0], True, ["ZWACHATSESSION.ZCONTACTJID", "ZMEMBERJID"], "ZGROUPINFO", "ios")
chat_filter_exclude = get_chat_condition(
filter_chat[1], False, ["ZWACHATSESSION.ZCONTACTJID", "ZMEMBERJID"], "ZGROUPINFO", "ios")
date_filter = f'AND ZMESSAGEDATE {filter_date}' if filter_date is not None else ''
# Get media count
media_count_query = f"""
SELECT count()
@@ -329,8 +339,8 @@ def media(db, data, media_folder, filter_date, filter_chat, filter_empty, separa
"""
c.execute(media_count_query)
total_row_number = c.fetchone()[0]
print(f"\nProcessing media...(0/{total_row_number})", end="\r")
logging.info(f"Processing media...(0/{total_row_number})", extra={"clear": True})
# Fetch media items
media_query = f"""
SELECT ZCONTACTJID,
@@ -354,36 +364,28 @@ def media(db, data, media_folder, filter_date, filter_chat, filter_empty, separa
ORDER BY ZCONTACTJID ASC
"""
c.execute(media_query)
# Process each media item
mime = MimeTypes()
i = 0
content = c.fetchone()
while content is not None:
process_media_item(content, data, media_folder, mime, separate_media)
# Update progress
i += 1
if i % 100 == 0:
print(f"Processing media...({i}/{total_row_number})", end="\r")
content = c.fetchone()
print(f"Processing media...({total_row_number}/{total_row_number})", end="\r")
with tqdm(total=total_row_number, desc="Processing media", unit="media", leave=False) as pbar:
while (content := c.fetchone()) is not None:
process_media_item(content, data, media_folder, mime, separate_media, fix_dot_files)
pbar.update(1)
total_time = pbar.format_dict['elapsed']
logging.info(f"Processed {total_row_number} media in {convert_time_unit(total_time)}")
def process_media_item(content, data, media_folder, mime, separate_media):
def process_media_item(content, data, media_folder, mime, separate_media, fix_dot_files=False):
"""Process a single media item."""
file_path = f"{media_folder}/Message/{content['ZMEDIALOCALPATH']}"
current_chat = data.get_chat(content["ZCONTACTJID"])
message = current_chat.get_message(content["ZMESSAGE"])
message.media = True
if current_chat.media_base == "":
current_chat.media_base = media_folder + "/"
if os.path.isfile(file_path):
message.data = '/'.join(file_path.split("/")[1:])
# Set MIME type
if content["ZVCARDSTRING"] is None:
guess = mime.guess_type(file_path)[0]
@@ -391,21 +393,34 @@ def process_media_item(content, data, media_folder, mime, separate_media):
else:
message.mime = content["ZVCARDSTRING"]
if fix_dot_files and file_path.endswith("."):
extension = mime.guess_extension(message.mime)
if message.mime == "application/octet-stream" or not extension:
new_file_path = file_path[:-1]
else:
extension = mime.guess_extension(message.mime)
new_file_path = file_path[:-1] + extension
os.rename(file_path, new_file_path)
file_path = new_file_path
# Handle separate media option
if separate_media:
chat_display_name = slugify(current_chat.name or message.sender or content["ZCONTACTJID"].split('@')[0], True)
chat_display_name = safe_name(
current_chat.name or message.sender or content["ZCONTACTJID"].split('@')[0])
current_filename = file_path.split("/")[-1]
new_folder = os.path.join(media_folder, "separated", chat_display_name)
Path(new_folder).mkdir(parents=True, exist_ok=True)
new_path = os.path.join(new_folder, current_filename)
shutil.copy2(file_path, new_path)
message.data = '/'.join(new_path.split("\\")[1:])
message.data = '/'.join(new_path.split("/")[1:])
else:
message.data = '/'.join(file_path.split("/")[1:])
else:
# Handle missing media
message.data = "The media is missing"
message.mime = "media"
message.meta = True
# Add caption if available
if content["ZTITLE"] is not None:
message.caption = content["ZTITLE"]
@@ -414,12 +429,14 @@ def process_media_item(content, data, media_folder, mime, separate_media):
def vcard(db, data, media_folder, filter_date, filter_chat, filter_empty):
"""Process vCard contacts from WhatsApp messages."""
c = db.cursor()
# Build filter conditions
chat_filter_include = get_chat_condition(filter_chat[0], True, ["ZCONTACTJID", "ZMEMBERJID"], "ZGROUPINFO", "ios")
chat_filter_exclude = get_chat_condition(filter_chat[1], False, ["ZCONTACTJID", "ZMEMBERJID"], "ZGROUPINFO", "ios")
chat_filter_include = get_chat_condition(
filter_chat[0], True, ["ZCONTACTJID", "ZMEMBERJID"], "ZGROUPINFO", "ios")
chat_filter_exclude = get_chat_condition(
filter_chat[1], False, ["ZCONTACTJID", "ZMEMBERJID"], "ZGROUPINFO", "ios")
date_filter = f'AND ZWAMESSAGE.ZMESSAGEDATE {filter_date}' if filter_date is not None else ''
# Fetch vCard mentions
vcard_query = f"""
SELECT DISTINCT ZWAVCARDMENTION.ZMEDIAITEM,
@@ -444,16 +461,19 @@ def vcard(db, data, media_folder, filter_date, filter_chat, filter_empty):
c.execute(vcard_query)
contents = c.fetchall()
total_row_number = len(contents)
print(f"\nProcessing vCards...(0/{total_row_number})", end="\r")
logging.info(f"Processing vCards...(0/{total_row_number})", extra={"clear": True})
# Create vCards directory
path = f'{media_folder}/Message/vCards'
Path(path).mkdir(parents=True, exist_ok=True)
# Process each vCard
for index, content in enumerate(contents):
process_vcard_item(content, path, data)
print(f"Processing vCards...({index + 1}/{total_row_number})", end="\r")
with tqdm(total=total_row_number, desc="Processing vCards", unit="vcard", leave=False) as pbar:
for content in contents:
process_vcard_item(content, path, data)
pbar.update(1)
total_time = pbar.format_dict['elapsed']
logging.info(f"Processed {total_row_number} vCards in {convert_time_unit(total_time)}")
def process_vcard_item(content, path, data):
@@ -478,9 +498,10 @@ def process_vcard_item(content, path, data):
f.write(vcard_string)
# Create vCard summary and update message
vcard_summary = "This media includes the following vCard file(s):<br>"
vcard_summary += " | ".join([f'<a href="{htmle(fp)}">{htmle(name)}</a>' for name, fp in zip(vcard_names, file_paths)])
vcard_summary = "This media includes the following vCard file(s):<br>"
vcard_summary += " | ".join([f'<a href="{htmle(fp)}">{htmle(name)}</a>' for name,
fp in zip(vcard_names, file_paths)])
message = data.get_chat(content["ZCONTACTJID"]).get_message(content["ZMESSAGE"])
message.data = vcard_summary
message.mime = "text/x-vcard"
@@ -492,11 +513,13 @@ def process_vcard_item(content, path, data):
def calls(db, data, timezone_offset, filter_chat):
"""Process WhatsApp call records."""
c = db.cursor()
# Build filter conditions
chat_filter_include = get_chat_condition(filter_chat[0], True, ["ZGROUPCALLCREATORUSERJIDSTRING"], None, "ios")
chat_filter_exclude = get_chat_condition(filter_chat[1], False, ["ZGROUPCALLCREATORUSERJIDSTRING"], None, "ios")
chat_filter_include = get_chat_condition(
filter_chat[0], True, ["ZGROUPCALLCREATORUSERJIDSTRING"], None, "ios")
chat_filter_exclude = get_chat_condition(
filter_chat[1], False, ["ZGROUPCALLCREATORUSERJIDSTRING"], None, "ios")
# Get call count
call_count_query = f"""
SELECT count()
@@ -509,9 +532,7 @@ def calls(db, data, timezone_offset, filter_chat):
total_row_number = c.fetchone()[0]
if total_row_number == 0:
return
print(f"\nProcessing calls...({total_row_number})", end="\r")
# Fetch call records
calls_query = f"""
SELECT ZCALLIDSTRING,
@@ -532,18 +553,19 @@ def calls(db, data, timezone_offset, filter_chat):
{chat_filter_exclude}
"""
c.execute(calls_query)
# Create calls chat
chat = ChatStore(Device.ANDROID, "WhatsApp Calls")
# Process each call
content = c.fetchone()
while content is not None:
process_call_record(content, chat, data, timezone_offset)
content = c.fetchone()
with tqdm(total=total_row_number, desc="Processing calls", unit="call", leave=False) as pbar:
while (content := c.fetchone()) is not None:
process_call_record(content, chat, data, timezone_offset)
pbar.update(1)
total_time = pbar.format_dict['elapsed']
# Add calls chat to data
data.add_chat("000000000000000", chat)
logging.info(f"Processed {total_row_number} calls in {convert_time_unit(total_time)}")
def process_call_record(content, chat, data, timezone_offset):
@@ -554,9 +576,9 @@ def process_call_record(content, chat, data, timezone_offset):
timestamp=ts,
time=ts,
key_id=content["ZCALLIDSTRING"],
timezone_offset=timezone_offset if timezone_offset else CURRENT_TZ_OFFSET
timezone_offset=timezone_offset
)
# Set sender info
_jid = content["ZGROUPCALLCREATORUSERJIDSTRING"]
name = data.get_chat(_jid).name if _jid in data else None
@@ -565,11 +587,11 @@ def process_call_record(content, chat, data, timezone_offset):
else:
fallback = None
call.sender = name or fallback
# Set call metadata
call.meta = True
call.data = format_call_data(call, content)
# Add call to chat
chat.add_message(call.key_id, call)
@@ -583,7 +605,7 @@ def format_call_data(call, content):
f"call {'to' if call.from_me else 'from'} "
f"{call.sender} was "
)
# Call outcome
if content['ZOUTCOME'] in (1, 4):
call_data += "not answered." if call.from_me else "missed."
@@ -598,5 +620,5 @@ def format_call_data(call, content):
)
else:
call_data += "in an unknown state."
return call_data
return call_data


@@ -1,11 +1,14 @@
#!/usr/bin/python3
import logging
import shutil
import sqlite3
import os
import getpass
from sys import exit
from Whatsapp_Chat_Exporter.utility import WhatsAppIdentifier
from sys import exit, platform as osname
import sys
from tqdm import tqdm
from Whatsapp_Chat_Exporter.utility import WhatsAppIdentifier, convert_time_unit
from Whatsapp_Chat_Exporter.bplist import BPListReader
try:
from iphone_backup_decrypt import EncryptedBackup, RelativePath
@@ -15,6 +18,8 @@ else:
support_encrypted = True
class BackupExtractor:
"""
A class to handle the extraction of WhatsApp data from iOS backups,
@@ -42,28 +47,41 @@ class BackupExtractor:
Returns:
bool: True if encrypted, False otherwise.
"""
with sqlite3.connect(os.path.join(self.base_dir, "Manifest.db")) as db:
c = db.cursor()
try:
c.execute("SELECT count() FROM Files")
c.fetchone() # Execute and fetch to trigger potential errors
except (sqlite3.OperationalError, sqlite3.DatabaseError):
return True
try:
with sqlite3.connect(os.path.join(self.base_dir, "Manifest.db")) as db:
c = db.cursor()
try:
c.execute("SELECT count() FROM Files")
c.fetchone() # Execute and fetch to trigger potential errors
except (sqlite3.OperationalError, sqlite3.DatabaseError):
return True
else:
return False
except sqlite3.DatabaseError as e:
if str(e) == "authorization denied" and osname == "darwin":
logging.error(
"You don't have permission to access the backup database. Please "
"check your permissions or try moving the backup to somewhere else."
)
exit(8)
else:
return False
raise e
def _extract_encrypted_backup(self):
"""
Handles the extraction of data from an encrypted iOS backup.
"""
if not support_encrypted:
print("You don't have the dependencies to handle encrypted backup.")
print("Read more on how to deal with encrypted backup:")
print("https://github.com/KnugiHK/Whatsapp-Chat-Exporter/blob/main/README.md#usage")
logging.error("You don't have the dependencies to handle encrypted backup. "
"Read more on how to deal with encrypted backup: "
"https://github.com/KnugiHK/Whatsapp-Chat-Exporter/blob/main/README.md#usage"
)
return
print("Encryption detected on the backup!")
logging.info(f"Encryption detected on the backup!")
password = getpass.getpass("Enter the password for the backup:")
sys.stdout.write("\033[F\033[K")
sys.stdout.flush()
self._decrypt_backup(password)
self._extract_decrypted_files()
@@ -74,7 +92,7 @@ class BackupExtractor:
Args:
password (str): The password for the encrypted backup.
"""
print("Trying to decrypt the iOS backup...", end="")
logging.info(f"Trying to open the iOS backup...")
self.backup = EncryptedBackup(
backup_directory=self.base_dir,
passphrase=password,
@@ -82,7 +100,8 @@ class BackupExtractor:
check_same_thread=False,
decrypt_chunk_size=self.decrypt_chunk_size,
)
print("Done\nDecrypting WhatsApp database...", end="")
logging.info(f"iOS backup is opened successfully")
logging.info("Decrypting WhatsApp database...", extra={"clear": True})
try:
self.backup.extract_file(
relative_path=RelativePath.WHATSAPP_MESSAGES,
@@ -100,23 +119,26 @@ class BackupExtractor:
output_filename=self.identifiers.CALL,
)
except ValueError:
print("Failed to decrypt backup: incorrect password?")
logging.error("Failed to decrypt backup: incorrect password?")
exit(7)
except FileNotFoundError:
print(
logging.error(
"Essential WhatsApp files are missing from the iOS backup. "
"Perhaps you enabled end-to-end encryption for the backup? "
"See https://wts.knugi.dev/docs.html?dest=iose2e"
)
exit(6)
else:
print("Done")
logging.info(f"WhatsApp database decrypted successfully")
def _extract_decrypted_files(self):
"""Extract all WhatsApp files after decryption"""
pbar = tqdm(desc="Decrypting and extracting files", unit="file", leave=False)
def extract_progress_handler(file_id, domain, relative_path, n, total_files):
if n % 100 == 0:
print(f"Decrypting and extracting files...({n}/{total_files})", end="\r")
if pbar.total is None:
pbar.total = total_files
pbar.n = n
pbar.refresh()
return True
self.backup.extract_files(
@@ -125,7 +147,9 @@ class BackupExtractor:
preserve_folders=True,
filter_callback=extract_progress_handler
)
print(f"All required files are decrypted and extracted. ", end="\n")
total_time = pbar.format_dict['elapsed']
pbar.close()
logging.info(f"All required files are decrypted and extracted in {convert_time_unit(total_time)}")
def _extract_unencrypted_backup(self):
"""
@@ -144,10 +168,10 @@ class BackupExtractor:
if not os.path.isfile(wts_db_path):
if self.identifiers is WhatsAppIdentifier:
print("WhatsApp database not found.")
logging.error("WhatsApp database not found.")
else:
print("WhatsApp Business database not found.")
print(
logging.error("WhatsApp Business database not found.")
logging.error(
"Essential WhatsApp files are missing from the iOS backup. "
"Perhaps you enabled end-to-end encryption for the backup? "
"See https://wts.knugi.dev/docs.html?dest=iose2e"
@@ -157,12 +181,12 @@ class BackupExtractor:
shutil.copyfile(wts_db_path, self.identifiers.MESSAGE)
if not os.path.isfile(contact_db_path):
print("Contact database not found. Skipping...")
logging.warning(f"Contact database not found. Skipping...")
else:
shutil.copyfile(contact_db_path, self.identifiers.CONTACT)
if not os.path.isfile(call_db_path):
print("Call database not found. Skipping...")
logging.warning(f"Call database not found. Skipping...")
else:
shutil.copyfile(call_db_path, self.identifiers.CALL)
@@ -176,7 +200,6 @@ class BackupExtractor:
c = manifest.cursor()
c.execute(f"SELECT count() FROM Files WHERE domain = '{_wts_id}'")
total_row_number = c.fetchone()[0]
print(f"Extracting WhatsApp files...(0/{total_row_number})", end="\r")
c.execute(
f"""
SELECT fileID, relativePath, flags, file AS metadata,
@@ -189,33 +212,30 @@ class BackupExtractor:
if not os.path.isdir(_wts_id):
os.mkdir(_wts_id)
row = c.fetchone()
while row is not None:
if not row["relativePath"]: # Skip empty relative paths
row = c.fetchone()
continue
with tqdm(total=total_row_number, desc="Extracting WhatsApp files", unit="file", leave=False) as pbar:
while (row := c.fetchone()) is not None:
if not row["relativePath"]: # Skip empty relative paths
continue
destination = os.path.join(_wts_id, row["relativePath"])
hashes = row["fileID"]
folder = hashes[:2]
flags = row["flags"]
destination = os.path.join(_wts_id, row["relativePath"])
hashes = row["fileID"]
folder = hashes[:2]
flags = row["flags"]
if flags == 2: # Directory
try:
os.mkdir(destination)
except FileExistsError:
pass
elif flags == 1: # File
shutil.copyfile(os.path.join(self.base_dir, folder, hashes), destination)
metadata = BPListReader(row["metadata"]).parse()
creation = metadata["$objects"][1]["Birth"]
modification = metadata["$objects"][1]["LastModified"]
os.utime(destination, (modification, modification))
if row["_index"] % 100 == 0:
print(f"Extracting WhatsApp files...({row['_index']}/{total_row_number})", end="\r")
row = c.fetchone()
print(f"Extracting WhatsApp files...({total_row_number}/{total_row_number})", end="\n")
if flags == 2: # Directory
try:
os.mkdir(destination)
except FileExistsError:
pass
elif flags == 1: # File
shutil.copyfile(os.path.join(self.base_dir, folder, hashes), destination)
metadata = BPListReader(row["metadata"]).parse()
_creation = metadata["$objects"][1]["Birth"]
modification = metadata["$objects"][1]["LastModified"]
os.utime(destination, (modification, modification))
pbar.update(1)
total_time = pbar.format_dict['elapsed']
logging.info(f"Extracted {total_row_number} WhatsApp files in {convert_time_unit(total_time)}")
def extract_media(base_dir, identifiers, decrypt_chunk_size):
@@ -229,4 +249,3 @@ def extract_media(base_dir, identifiers, decrypt_chunk_size):
"""
extractor = BackupExtractor(base_dir, identifiers, decrypt_chunk_size)
extractor.extract()


@@ -1,3 +1,4 @@
import logging
import sqlite3
import jinja2
import json
@@ -5,18 +6,21 @@ import os
import unicodedata
import re
import math
import shutil
from bleach import clean as sanitize
from markupsafe import Markup
from datetime import datetime, timedelta
from enum import IntEnum
from Whatsapp_Chat_Exporter.data_model import ChatStore
from typing import Dict, List, Optional, Tuple
from tqdm import tqdm
from Whatsapp_Chat_Exporter.data_model import ChatCollection, ChatStore, Timing
from typing import Dict, List, Optional, Tuple, Union, Any
try:
from enum import StrEnum, IntEnum
except ImportError:
# < Python 3.11
# This should be removed when the support for Python 3.10 ends.
# This should be removed when the support for Python 3.10 ends. (31 Oct 2026)
from enum import Enum
class StrEnum(str, Enum):
pass
@@ -28,6 +32,7 @@ ROW_SIZE = 0x3D0
CURRENT_TZ_OFFSET = datetime.now().astimezone().utcoffset().total_seconds() / 3600
def convert_time_unit(time_second: int) -> str:
"""Converts a time duration in seconds to a human-readable string.
@@ -37,23 +42,31 @@ def convert_time_unit(time_second: int) -> str:
Returns:
str: A human-readable string representing the time duration.
"""
time = str(timedelta(seconds=time_second))
if "day" not in time:
if time_second < 1:
time = "less than a second"
elif time_second == 1:
time = "a second"
elif time_second < 60:
time = time[5:][1 if time_second < 10 else 0:] + " seconds"
elif time_second == 60:
time = "a minute"
elif time_second < 3600:
time = time[2:] + " minutes"
elif time_second == 3600:
time = "an hour"
else:
time += " hour"
return time
if time_second < 1:
return "less than a second"
elif time_second == 1:
return "a second"
delta = timedelta(seconds=time_second)
parts = []
days = delta.days
if days > 0:
parts.append(f"{days} day{'s' if days > 1 else ''}")
hours = delta.seconds // 3600
if hours > 0:
parts.append(f"{hours} hour{'s' if hours > 1 else ''}")
minutes = (delta.seconds % 3600) // 60
if minutes > 0:
parts.append(f"{minutes} minute{'s' if minutes > 1 else ''}")
seconds = delta.seconds % 60
if seconds > 0:
parts.append(f"{seconds} second{'s' if seconds > 1 else ''}")
return " ".join(parts)
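
For reference, the rewritten `convert_time_unit` can be exercised on its own. The block below is a sketch that re-indents the new lines from this hunk (the indentation is an assumption, since this view strips it) and adds a few sample calls. Note that fractional seconds are truncated, because only `delta.days` and `delta.seconds` are read.

```python
from datetime import timedelta


def convert_time_unit(time_second):
    """Convert a duration in seconds to a human-readable string."""
    if time_second < 1:
        return "less than a second"
    elif time_second == 1:
        return "a second"

    delta = timedelta(seconds=time_second)
    parts = []

    days = delta.days
    if days > 0:
        parts.append(f"{days} day{'s' if days > 1 else ''}")

    hours = delta.seconds // 3600
    if hours > 0:
        parts.append(f"{hours} hour{'s' if hours > 1 else ''}")

    minutes = (delta.seconds % 3600) // 60
    if minutes > 0:
        parts.append(f"{minutes} minute{'s' if minutes > 1 else ''}")

    seconds = delta.seconds % 60
    if seconds > 0:
        parts.append(f"{seconds} second{'s' if seconds > 1 else ''}")

    return " ".join(parts)


# Sample inputs, including a float such as tqdm's elapsed time would give.
samples = {s: convert_time_unit(s) for s in (0.4, 1, 61, 125.7, 3600, 90061)}
```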
def bytes_to_readable(size_bytes: int) -> str:
@@ -70,8 +83,8 @@ def bytes_to_readable(size_bytes: int) -> str:
Returns:
A human-readable string representing the file size.
"""
if size_bytes == 0:
return "0B"
if size_bytes < 1024:
return f"{size_bytes} B"
size_name = ("B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
i = int(math.floor(math.log(size_bytes, 1024)))
p = math.pow(1024, i)
@@ -99,14 +112,19 @@ def readable_to_bytes(size_str: str) -> int:
'TB': 1024**4,
'PB': 1024**5,
'EB': 1024**6,
'ZB': 1024**7,
'ZB': 1024**7,
'YB': 1024**8
}
size_str = size_str.upper().strip()
number, unit = size_str[:-2].strip(), size_str[-2:].strip()
if unit not in SIZE_UNITS or not number.isnumeric():
raise ValueError("Invalid input for size_str. Example: 1024GB")
return int(number) * SIZE_UNITS[unit]
if size_str.isnumeric():
# If the string is purely numeric, assume it's in bytes
return int(size_str)
match = re.fullmatch(r'^(\d+(\.\d*)?)\s*([KMGTPEZY]?B)?$', size_str)
if not match:
raise ValueError("Invalid size format for size_str. Expected format like '10MB', '1024GB', or '512'.")
unit = ''.join(filter(str.isalpha, size_str)).strip()
number = ''.join(c for c in size_str if c.isdigit() or c == '.').strip()
return int(float(number) * SIZE_UNITS[unit])
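
One thing to watch in the new parsing path: the regex makes the unit optional, so an input such as `1.5` passes the match with an empty unit string. The sketch below shows one possible way to handle that by reading the number and unit from the match groups and defaulting to bytes. The name `parse_size`, the default of `"B"`, and the full unit table (including a `"B"` entry) are assumptions for illustration, not the project's code.

```python
import re

# Assumed unit table; the hunk above only shows part of the real SIZE_UNITS dict.
SIZE_UNITS = {
    unit: 1024 ** power
    for power, unit in enumerate(("B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"))
}

_SIZE_RE = re.compile(r"(\d+(?:\.\d*)?)\s*([KMGTPEZY]?B)?")


def parse_size(size_str):
    """Parse strings like '10MB', '1.5 kb' or '512' into a byte count."""
    match = _SIZE_RE.fullmatch(size_str.upper().strip())
    if not match:
        raise ValueError(
            "Invalid size format. Expected format like '10MB', '1024GB', or '512'."
        )
    number = match.group(1)
    unit = match.group(2) or "B"  # a bare number is treated as bytes
    return int(float(number) * SIZE_UNITS[unit])


plain = parse_size("512")
megabytes = parse_size("10MB")
fractional = parse_size("1.5 kb")
```

Reading the groups directly also avoids re-filtering the string character by character after it has already been validated.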
def sanitize_except(html: str) -> Markup:
@@ -139,51 +157,55 @@ def determine_day(last: int, current: int) -> Optional[datetime.date]:
return current
def check_update():
def check_update(include_beta: bool = False) -> int:
import urllib.request
import json
import importlib
from sys import platform
from packaging import version
PACKAGE_JSON = "https://pypi.org/pypi/whatsapp-chat-exporter/json"
try:
raw = urllib.request.urlopen(PACKAGE_JSON)
except Exception:
print("Failed to check for updates.")
logging.error("Failed to check for updates.")
return 1
else:
with raw:
package_info = json.load(raw)
latest_version = tuple(map(int, package_info["info"]["version"].split(".")))
__version__ = importlib.metadata.version("whatsapp_chat_exporter")
current_version = tuple(map(int, __version__.split(".")))
if current_version < latest_version:
print("===============Update===============")
print("A newer version of WhatsApp Chat Exporter is available.")
print("Current version: " + __version__)
print("Latest version: " + package_info["info"]["version"])
if platform == "win32":
print("Update with: pip install --upgrade whatsapp-chat-exporter")
else:
print("Update with: pip3 install --upgrade whatsapp-chat-exporter")
print("====================================")
if include_beta:
all_versions = [version.parse(v) for v in package_info["releases"].keys()]
latest_version = max(all_versions, key=lambda v: (v.release, v.pre))
else:
print("You are using the latest version of WhatsApp Chat Exporter.")
latest_version = version.parse(package_info["info"]["version"])
current_version = version.parse(importlib.metadata.version("whatsapp_chat_exporter"))
if current_version < latest_version:
logging.info(
"===============Update===============\n"
"A newer version of WhatsApp Chat Exporter is available.\n"
f"Current version: {current_version}\n"
f"Latest version: {latest_version}"
)
pip_cmd = "pip" if platform == "win32" else "pip3"
logging.info(f"Update with: {pip_cmd} install --upgrade whatsapp-chat-exporter {'--pre' if include_beta else ''}")
logging.info("====================================")
else:
logging.info("You are using the latest version of WhatsApp Chat Exporter.")
return 0
def rendering(
output_file_name,
template,
name,
msgs,
contact,
w3css,
chat,
headline,
next=False,
previous=False
):
output_file_name,
template,
name,
msgs,
contact,
w3css,
chat,
headline,
next=False,
previous=False
):
if chat.their_avatar_thumb is None and chat.their_avatar is not None:
their_avatar_thumb = chat.their_avatar
else:
@@ -215,59 +237,250 @@ class Device(StrEnum):
EXPORTED = "exported"
def import_from_json(json_file: str, data: Dict[str, ChatStore]):
def import_from_json(json_file: str, data: ChatCollection):
"""Imports chat data from a JSON file into the data dictionary.
Args:
json_file: The path to the JSON file.
data: The dictionary to store the imported chat data.
"""
from Whatsapp_Chat_Exporter.data_model import ChatStore, Message
with open(json_file, "r") as f:
temp_data = json.loads(f.read())
total_row_number = len(tuple(temp_data.keys()))
print(f"Importing chats from JSON...(0/{total_row_number})", end="\r")
for index, (jid, chat_data) in enumerate(temp_data.items()):
chat = ChatStore(chat_data.get("type"), chat_data.get("name"))
chat.my_avatar = chat_data.get("my_avatar")
chat.their_avatar = chat_data.get("their_avatar")
chat.their_avatar_thumb = chat_data.get("their_avatar_thumb")
chat.status = chat_data.get("status")
for id, msg in chat_data.get("messages").items():
message = Message(
from_me=msg["from_me"],
timestamp=msg["timestamp"],
time=msg["time"],
key_id=msg["key_id"],
received_timestamp=msg.get("received_timestamp"),
read_timestamp=msg.get("read_timestamp")
with tqdm(total=total_row_number, desc="Importing chats from JSON", unit="chat", leave=False) as pbar:
for jid, chat_data in temp_data.items():
chat = ChatStore.from_json(chat_data)
data.add_chat(jid, chat)
pbar.update(1)
total_time = pbar.format_dict['elapsed']
logging.info(f"Imported {total_row_number} chats from JSON in {convert_time_unit(total_time)}")
class IncrementalMerger:
"""Handles incremental merging of WhatsApp chat exports."""
def __init__(self, pretty_print_json: int, avoid_encoding_json: bool):
"""Initialize the merger with JSON formatting options.
Args:
pretty_print_json: JSON indentation level.
avoid_encoding_json: Whether to avoid ASCII encoding.
"""
self.pretty_print_json = pretty_print_json
self.avoid_encoding_json = avoid_encoding_json
def _get_json_files(self, source_dir: str) -> List[str]:
"""Get list of JSON files from source directory.
Args:
source_dir: Path to the source directory.
Returns:
List of JSON filenames.
Raises:
SystemExit: If no JSON files are found.
"""
json_files = [f for f in os.listdir(source_dir) if f.endswith('.json')]
if not json_files:
logging.error("No JSON files found in the source directory.")
raise SystemExit(1)
logging.debug("JSON files found: %s", json_files)
return json_files
def _copy_new_file(self, source_path: str, target_path: str, target_dir: str, json_file: str) -> None:
"""Copy a new JSON file to target directory.
Args:
source_path: Path to source file.
target_path: Path to target file.
target_dir: Target directory path.
json_file: Name of the JSON file.
"""
logging.info(f"Copying '{json_file}' to target directory...")
os.makedirs(target_dir, exist_ok=True)
shutil.copy2(source_path, target_path)
def _load_chat_data(self, file_path: str) -> Dict[str, Any]:
"""Load JSON data from file.
Args:
file_path: Path to JSON file.
Returns:
Loaded JSON data.
"""
with open(file_path, 'r') as file:
return json.load(file)
def _parse_chats_from_json(self, data: Dict[str, Any]) -> Dict[str, Any]:
"""Parse JSON data into ChatStore objects.
Args:
data: Raw JSON data.
Returns:
Dictionary of JID to ChatStore objects.
"""
return {jid: ChatStore.from_json(chat) for jid, chat in data.items()}
def _merge_chat_stores(self, source_chats: Dict[str, Any], target_chats: Dict[str, Any]) -> Dict[str, Any]:
"""Merge source chats into target chats.
Args:
source_chats: Source ChatStore objects.
target_chats: Target ChatStore objects.
Returns:
Merged ChatStore objects.
"""
for jid, chat in source_chats.items():
if jid in target_chats:
target_chats[jid].merge_with(chat)
else:
target_chats[jid] = chat
return target_chats
def _serialize_chats(self, chats: Dict[str, Any]) -> Dict[str, Any]:
"""Serialize ChatStore objects to JSON format.
Args:
chats: Dictionary of ChatStore objects.
Returns:
Serialized JSON data.
"""
return {jid: chat.to_json() for jid, chat in chats.items()}
def _has_changes(self, merged_data: Dict[str, Any], original_data: Dict[str, Any]) -> bool:
"""Check if merged data differs from original data.
Args:
merged_data: Merged JSON data.
original_data: Original JSON data.
Returns:
True if changes detected, False otherwise.
"""
return json.dumps(merged_data, sort_keys=True) != json.dumps(original_data, sort_keys=True)
def _save_merged_data(self, target_path: str, merged_data: Dict[str, Any]) -> None:
"""Save merged data to target file.
Args:
target_path: Path to target file.
merged_data: Merged JSON data.
"""
with open(target_path, 'w') as merged_file:
json.dump(
merged_data,
merged_file,
indent=self.pretty_print_json,
ensure_ascii=not self.avoid_encoding_json,
)
message.media = msg.get("media")
message.meta = msg.get("meta")
message.data = msg.get("data")
message.sender = msg.get("sender")
message.safe = msg.get("safe")
message.mime = msg.get("mime")
message.reply = msg.get("reply")
message.quoted_data = msg.get("quoted_data")
message.caption = msg.get("caption")
message.thumb = msg.get("thumb")
message.sticker = msg.get("sticker")
chat.add_message(id, message)
data[jid] = chat
print(f"Importing chats from JSON...({index + 1}/{total_row_number})", end="\r")
def _merge_json_file(self, source_path: str, target_path: str, json_file: str) -> None:
"""Merge a single JSON file.
Args:
source_path: Path to source file.
target_path: Path to target file.
json_file: Name of the JSON file.
"""
logging.info(f"Merging '{json_file}' with existing file in target directory...", extra={"clear": True})
source_data = self._load_chat_data(source_path)
target_data = self._load_chat_data(target_path)
source_chats = self._parse_chats_from_json(source_data)
target_chats = self._parse_chats_from_json(target_data)
merged_chats = self._merge_chat_stores(source_chats, target_chats)
merged_data = self._serialize_chats(merged_chats)
if self._has_changes(merged_data, target_data):
logging.info(f"Changes detected in '{json_file}', updating target file...")
self._save_merged_data(target_path, merged_data)
else:
logging.info(f"No changes detected in '{json_file}', skipping update.")
def _should_copy_media_file(self, source_file: str, target_file: str) -> bool:
"""Check if media file should be copied.
Args:
source_file: Path to source media file.
target_file: Path to target media file.
Returns:
True if file should be copied, False otherwise.
"""
return not os.path.exists(target_file) or os.path.getmtime(source_file) > os.path.getmtime(target_file)
def _merge_media_directories(self, source_dir: str, target_dir: str, media_dir: str) -> None:
"""Merge media directories from source to target.
Args:
source_dir: Source directory path.
target_dir: Target directory path.
media_dir: Media directory name.
"""
source_media_path = os.path.join(source_dir, media_dir)
target_media_path = os.path.join(target_dir, media_dir)
logging.info(f"Merging media directories. Source: {source_media_path}, target: {target_media_path}")
if not os.path.exists(source_media_path):
return
for root, _, files in os.walk(source_media_path):
relative_path = os.path.relpath(root, source_media_path)
target_root = os.path.join(target_media_path, relative_path)
os.makedirs(target_root, exist_ok=True)
for file in files:
source_file = os.path.join(root, file)
target_file = os.path.join(target_root, file)
if self._should_copy_media_file(source_file, target_file):
logging.debug(f"Copying '{source_file}' to '{target_file}'...")
shutil.copy2(source_file, target_file)
def merge(self, source_dir: str, target_dir: str, media_dir: str) -> None:
"""Merge JSON files and media from source to target directory.
Args:
source_dir: The path to the source directory containing JSON files.
target_dir: The path to the target directory to merge into.
media_dir: The path to the media directory.
"""
json_files = self._get_json_files(source_dir)
logging.info("Starting incremental merge process...")
for json_file in json_files:
source_path = os.path.join(source_dir, json_file)
target_path = os.path.join(target_dir, json_file)
if not os.path.exists(target_path):
self._copy_new_file(source_path, target_path, target_dir, json_file)
else:
self._merge_json_file(source_path, target_path, json_file)
self._merge_media_directories(source_dir, target_dir, media_dir)
def sanitize_filename(file_name: str) -> str:
"""Sanitizes a filename by removing invalid and unsafe characters.
def incremental_merge(source_dir: str, target_dir: str, media_dir: str, pretty_print_json: int, avoid_encoding_json: bool) -> None:
"""Wrapper for merging JSON files from the source directory into the target directory.
Args:
file_name: The filename to sanitize.
Returns:
The sanitized filename.
source_dir: The path to the source directory containing JSON files.
target_dir: The path to the target directory to merge into.
media_dir: The path to the media directory.
pretty_print_json: JSON indentation level.
avoid_encoding_json: Whether to avoid ASCII encoding.
"""
return "".join(x for x in file_name if x.isalnum() or x in "- ")
merger = IncrementalMerger(pretty_print_json, avoid_encoding_json)
merger.merge(source_dir, target_dir, media_dir)
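The two decision rules used by the merger above can be tried in isolation. The following is a minimal sketch, not the project's code: `has_changes` and `should_copy` are standalone stand-ins for `_has_changes` and `_should_copy_media_file`, and the file names and timestamps are made up for illustration.

```python
import json
import os
import tempfile


def has_changes(merged: dict, original: dict) -> bool:
    # Same comparison as _has_changes: keys are sorted before comparing,
    # so a mere reordering of keys does not count as a change.
    return json.dumps(merged, sort_keys=True) != json.dumps(original, sort_keys=True)


def should_copy(source_file: str, target_file: str) -> bool:
    # Same rule as _should_copy_media_file: copy when the target is missing
    # or when the source has a newer modification time.
    return (not os.path.exists(target_file)
            or os.path.getmtime(source_file) > os.path.getmtime(target_file))


reordered = has_changes({"a": 1, "b": 2}, {"b": 2, "a": 1})
edited = has_changes({"a": 1, "b": 3}, {"a": 1, "b": 2})

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "source.jpg")  # hypothetical media file
    dst = os.path.join(tmp, "target.jpg")
    open(src, "wb").close()
    copy_when_missing = should_copy(src, dst)
    open(dst, "wb").close()
    os.utime(src, (1_000, 1_000))
    os.utime(dst, (2_000, 2_000))
    copy_when_older = should_copy(src, dst)
    os.utime(src, (3_000, 3_000))
    copy_when_newer = should_copy(src, dst)

print(reordered, edited, copy_when_missing, copy_when_older, copy_when_newer)
```

Under these assumptions, a target file is rewritten only when the serialized content differs, and media is copied only when it is new or more recent than the existing copy.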
def get_file_name(contact: str, chat: ChatStore) -> Tuple[str, str]:
@@ -299,7 +512,7 @@ def get_file_name(contact: str, chat: ChatStore) -> Tuple[str, str]:
else:
name = phone_number
return sanitize_filename(file_name), name
return safe_name(file_name), name
def get_cond_for_empty(enable: bool, jid_field: str, broadcast_field: str) -> str:
@@ -316,9 +529,41 @@ def get_cond_for_empty(enable: bool, jid_field: str, broadcast_field: str) -> st
return f"AND (chat.hidden=0 OR {jid_field}='status@broadcast' OR {broadcast_field}>0)" if enable else ""
def get_chat_condition(filter: Optional[List[str]], include: bool, columns: List[str], jid: Optional[str] = None, platform: Optional[str] = None) -> str:
def _get_group_condition(jid: str, platform: str) -> str:
"""Generate platform-specific group identification condition.
Args:
jid: The JID column name.
platform: The platform ("android" or "ios").
Returns:
SQL condition string for group identification.
Raises:
ValueError: If platform is not supported.
"""
if platform == "android":
return f"{jid}.type == 1"
elif platform == "ios":
return f"{jid} IS NOT NULL"
else:
raise ValueError(
"Only android and ios are supported for argument platform if jid is not None")
def get_chat_condition(
filter: Optional[List[str]],
include: bool,
columns: List[str],
jid: Optional[str] = None,
platform: Optional[str] = None
) -> str:
"""Generates a SQL condition for filtering chats based on inclusion or exclusion criteria.
SQL injection risks from chat filters were evaluated during development and deemed negligible
due to the tool's offline, trusted-input model (user running this tool on WhatsApp
backups/databases on their own device).
Args:
filter: A list of phone numbers to include or exclude.
include: True to include chats that match the filter, False to exclude them.
@@ -332,29 +577,39 @@ def get_chat_condition(filter: Optional[List[str]], include: bool, columns: List
Raises:
ValueError: If the column count is invalid or an unsupported platform is provided.
"""
if filter is not None:
conditions = []
if len(columns) < 2 and jid is not None:
raise ValueError("There must be at least two elements in argument columns if jid is not None")
if jid is not None:
if platform == "android":
is_group = f"{jid}.type == 1"
elif platform == "ios":
is_group = f"{jid} IS NOT NULL"
else:
raise ValueError("Only android and ios are supported for argument platform if jid is not None")
for index, chat in enumerate(filter):
if include:
conditions.append(f"{' OR' if index > 0 else ''} {columns[0]} LIKE '%{chat}%'")
if len(columns) > 1:
conditions.append(f" OR ({columns[1]} LIKE '%{chat}%' AND {is_group})")
else:
conditions.append(f"{' AND' if index > 0 else ''} {columns[0]} NOT LIKE '%{chat}%'")
if len(columns) > 1:
conditions.append(f" AND ({columns[1]} NOT LIKE '%{chat}%' AND {is_group})")
return f"AND ({' '.join(conditions)})"
else:
if not filter:
return ""
if jid is not None and len(columns) < 2:
raise ValueError(
"There must be at least two elements in argument columns if jid is not None")
# Get group condition if needed
is_group_condition = None
if jid is not None:
is_group_condition = _get_group_condition(jid, platform)
# Build conditions for each chat filter
conditions = []
for index, chat in enumerate(filter):
# Add connector for subsequent conditions (with double space)
connector = " OR" if include else " AND"
prefix = connector if index > 0 else ""
# Primary column condition
operator = "LIKE" if include else "NOT LIKE"
conditions.append(f"{prefix} {columns[0]} {operator} '%{chat}%'")
# Secondary column condition for groups
if len(columns) > 1 and is_group_condition:
if include:
group_condition = f" OR ({columns[1]} {operator} '%{chat}%' AND {is_group_condition})"
else:
group_condition = f" AND ({columns[1]} {operator} '%{chat}%' AND {is_group_condition})"
conditions.append(group_condition)
combined_conditions = "".join(conditions)
return f"AND ({combined_conditions})"
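The docstring above accepts string interpolation because the tool runs offline on trusted input. For comparison only, here is a hedged sketch of how the same kind of filter could be expressed with bound parameters. `chat_condition_with_params`, the toy `jid` table, and the sample numbers are assumptions made for this illustration and are not part of the project; the column name is still interpolated, so only the filter values are bound.

```python
import sqlite3
from typing import List, Tuple


def chat_condition_with_params(
    filters: List[str], include: bool, column: str
) -> Tuple[str, List[str]]:
    """Hypothetical variant: return a SQL fragment plus its bound values."""
    if not filters:
        return "", []
    operator = "LIKE" if include else "NOT LIKE"
    connector = " OR " if include else " AND "
    clause = connector.join(f"{column} {operator} ?" for _ in filters)
    return f"AND ({clause})", [f"%{chat}%" for chat in filters]


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE jid (raw_string TEXT)")
db.executemany(
    "INSERT INTO jid VALUES (?)",
    [("85212345678@s.whatsapp.net",), ("447700900123@s.whatsapp.net",)],
)

condition, params = chat_condition_with_params(["8521234"], True, "raw_string")
rows = db.execute(
    f"SELECT raw_string FROM jid WHERE 1=1 {condition}", params
).fetchall()
print(condition, rows)
```

The trade-off is that callers would have to carry the parameter list alongside the SQL fragment, which the current string-only return value avoids.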
# Android Specific
@@ -365,6 +620,7 @@ CRYPT14_OFFSETS = (
{"iv": 67, "db": 193},
{"iv": 67, "db": 194},
{"iv": 67, "db": 158},
{"iv": 67, "db": 196},
)
@@ -446,7 +702,7 @@ def determine_metadata(content: sqlite3.Row, init_msg: Optional[str]) -> Optiona
else:
msg = f"{old} changed their number to {new}"
elif content["action_type"] == 46:
return # Voice message in PM??? Seems no need to handle.
return # Voice message in PM??? Seems no need to handle.
elif content["action_type"] == 47:
msg = "The contact is an official business account"
elif content["action_type"] == 50:
@@ -459,11 +715,12 @@ def determine_metadata(content: sqlite3.Row, init_msg: Optional[str]) -> Optiona
else:
msg = "The security code in this chat changed"
elif content["action_type"] == 58:
msg = "You blocked this contact"
msg = "You blocked/unblocked this contact"
elif content["action_type"] == 67:
return # (PM) this contact use secure service from Facebook???
elif content["action_type"] == 69:
return # (PM) this contact use secure service from Facebook??? What's the difference with 67????
# (PM) this contact use secure service from Facebook??? What's the difference with 67????
return
else:
return # Unsupported
return msg
@@ -490,8 +747,73 @@ def get_status_location(output_folder: str, offline_static: str) -> str:
w3css_path = os.path.join(static_folder, "w3.css")
if not os.path.isfile(w3css_path):
with urllib.request.urlopen(w3css) as resp:
with open(w3css_path, "wb") as f: f.write(resp.read())
with open(w3css_path, "wb") as f:
f.write(resp.read())
w3css = os.path.join(offline_static, "w3.css")
return w3css
def check_jid_map(db: sqlite3.Connection) -> bool:
"""
Checks if the jid_map table exists in the database.
Args:
db (sqlite3.Connection): The SQLite database connection.
Returns:
bool: True if the jid_map table exists, False otherwise.
"""
cursor = db.cursor()
cursor.execute("SELECT name FROM sqlite_master WHERE type='table' AND name='jid_map'")
return cursor.fetchone() is not None
def get_jid_map_join(jid_map_exists: bool) -> str:
"""
Returns the SQL JOIN statements for jid_map table.
"""
if not jid_map_exists:
return ""
else:
return """LEFT JOIN jid_map as jid_map_global
ON chat.jid_row_id = jid_map_global.lid_row_id
LEFT JOIN jid lid_global
ON jid_map_global.jid_row_id = lid_global._id
LEFT JOIN jid_map as jid_map_group
ON message.sender_jid_row_id = jid_map_group.lid_row_id
LEFT JOIN jid lid_group
ON jid_map_group.jid_row_id = lid_group._id"""
def get_jid_map_selection(jid_map_exists: bool) -> tuple:
"""
Returns the SQL selection statements for jid_map table.
"""
if not jid_map_exists:
return "jid_global.raw_string", "jid_group.raw_string"
else:
return (
"COALESCE(lid_global.raw_string, jid_global.raw_string)",
"COALESCE(lid_group.raw_string, jid_group.raw_string)"
)
def get_transcription_selection(db: sqlite3.Connection) -> str:
"""
Returns the SQL selection statement for transcription text based on the database schema.
Args:
db (sqlite3.Connection): The SQLite database connection.
Returns:
str: The SQL selection statement for transcription.
"""
cursor = db.cursor()
cursor.execute("PRAGMA table_info(message_media)")
columns = [row[1] for row in cursor.fetchall()]
if "raw_transcription_text" in columns:
return "message_media.raw_transcription_text AS transcription_text"
else:
return "NULL AS transcription_text"
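The schema probe above can be exercised against an in-memory database. This sketch restates `get_transcription_selection` as shown in the diff and runs it on two toy `message_media` tables; the `message_row_id` column is an assumption used only to make the tables valid.

```python
import sqlite3


def get_transcription_selection(db: sqlite3.Connection) -> str:
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    cursor = db.cursor()
    cursor.execute("PRAGMA table_info(message_media)")
    columns = [row[1] for row in cursor.fetchall()]
    if "raw_transcription_text" in columns:
        return "message_media.raw_transcription_text AS transcription_text"
    return "NULL AS transcription_text"


# Older schema: no transcription column.
old_db = sqlite3.connect(":memory:")
old_db.execute("CREATE TABLE message_media (message_row_id INTEGER)")

# Newer schema: transcription column present.
new_db = sqlite3.connect(":memory:")
new_db.execute(
    "CREATE TABLE message_media (message_row_id INTEGER, raw_transcription_text TEXT)"
)

old_selection = get_transcription_selection(old_db)
new_selection = get_transcription_selection(new_db)
print(old_selection)
print(new_selection)
```

Either way the query exposes a `transcription_text` column, so downstream code does not need to branch on the schema version.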
def setup_template(template: Optional[str], no_avatar: bool, experimental: bool = False) -> jinja2.Template:
@@ -521,43 +843,137 @@ def setup_template(template: Optional[str], no_avatar: bool, experimental: bool
template_env.filters['sanitize_except'] = sanitize_except
return template_env.get_template(template_file)
# iOS Specific
APPLE_TIME = 978307200
def slugify(value: str, allow_unicode: bool = False) -> str:
def safe_name(text: Union[str, bytes]) -> str:
"""
Convert text to ASCII-only slugs for URL-safe strings.
Taken from https://github.com/django/django/blob/master/django/utils/text.py
Sanitize the input text and generate a safe file name.
This function serves a similar purpose to slugify() from
Django, previously used in this project, but is a clean-room
reimplementation tailored for performance and a narrower
use case for this project. Licensed under the same terms
as the project (MIT).
Args:
value (str): The string to convert to a slug.
allow_unicode (bool, optional): Whether to allow Unicode characters. Defaults to False.
text (str|bytes): The string to be sanitized.
Returns:
str: The slugified string with only alphanumerics, underscores, or hyphens.
str: The sanitized string with only alphanumerics, underscores, hyphens, or dots.
"""
value = str(value)
if allow_unicode:
value = unicodedata.normalize('NFKC', value)
else:
value = unicodedata.normalize('NFKD', value).encode('ascii', 'ignore').decode('ascii')
value = re.sub(r'[^\w\s-]', '', value.lower())
return re.sub(r'[-\s]+', '-', value).strip('-_')
if isinstance(text, bytes):
text = text.decode("utf-8", "ignore")
elif not isinstance(text, str):
raise TypeError("text must be a string or bytes")
normalized_text = unicodedata.normalize("NFKC", text)
safe_chars = [char for char in normalized_text if char.isalnum() or char in "-_ ."]
return "-".join(''.join(safe_chars).split())
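To make the behaviour of the new `safe_name()` concrete, the sketch below restates its logic as a standalone function and runs it on a few sample inputs. The sample strings are made up for illustration, and the error message here is simplified.

```python
import unicodedata
from typing import Union


def safe_name(text: Union[str, bytes]) -> str:
    if isinstance(text, bytes):
        text = text.decode("utf-8", "ignore")
    elif not isinstance(text, str):
        raise TypeError("expected str or bytes")
    # NFKC folds compatibility forms (e.g. full-width letters) to plain ones.
    normalized_text = unicodedata.normalize("NFKC", text)
    # Keep alphanumerics plus "-", "_", " " and "."; drop everything else.
    safe_chars = [c for c in normalized_text if c.isalnum() or c in "-_ ."]
    # Collapse runs of whitespace into single hyphens.
    return "-".join("".join(safe_chars).split())


with_punctuation = safe_name("John Doe: notes/2024.txt")
from_bytes = safe_name(b"caf\xc3\xa9  menu")
full_width = safe_name("\uff46\uff55\uff4c\uff4c width")
print(with_punctuation, from_bytes, full_width)
```

Unlike the previous `slugify()` path, non-ASCII letters and the original casing are preserved, and dots survive so file extensions stay intact.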
def get_from_string(msg: Dict, chat_id: str) -> str:
"""Return the number or name for the sender"""
if msg["from_me"]:
return "Me"
if msg["sender"]:
return str(msg["sender"])
return str(chat_id)
def get_chat_type(chat_id: str) -> str:
"""Return the chat type based on the whatsapp id"""
if chat_id == "000000000000000":
return "calls"
elif chat_id.endswith("@s.whatsapp.net"):
return "personal_chat"
elif chat_id.endswith("@g.us"):
return "private_group"
elif chat_id == "status@broadcast":
return "status_broadcast"
elif chat_id.endswith("@broadcast"):
return "broadcast_channel"
logging.warning(f"Unknown chat type for {chat_id}, defaulting to private_group")
return "private_group"
def get_from_id(msg: Dict, chat_id: str) -> str:
"""Return the user id for the sender"""
if msg["from_me"]:
return "user00000"
if msg["sender"]:
return "user" + msg["sender"]
return f"user{chat_id}"
def get_reply_id(data: Dict, reply_key: int) -> Optional[int]:
"""Get the id of the message corresponding to the reply"""
if not reply_key:
return None
for msg_id, msg in data["messages"].items():
if msg["key_id"] == reply_key:
return msg_id
return None
def telegram_json_format(jik: str, data: Dict, timezone_offset) -> Dict:
"""Convert the data to the Telegram export format"""
timing = Timing(timezone_offset or CURRENT_TZ_OFFSET)
try:
chat_id = int(''.join([c for c in jik if c.isdigit()]))
except ValueError:
# not a real chat: e.g. statusbroadcast
chat_id = 0
json_obj = {
"name": data["name"] if data["name"] else jik,
"type": get_chat_type(jik),
"id": chat_id,
"messages": [ {
"id": int(msgId),
"type": "message",
"date": timing.format_timestamp(msg["timestamp"], "%Y-%m-%dT%H:%M:%S"),
"date_unixtime": int(msg["timestamp"]),
"from": get_from_string(msg, chat_id),
"from_id": get_from_id(msg, chat_id),
"reply_to_message_id": get_reply_id(data, msg["reply"]),
"text": msg["data"],
"text_entities": [
{
# TODO this will lose formatting and different types
"type": "plain",
"text": msg["data"],
}
],
}
for msgId, msg in data["messages"].items()]
}
# remove empty messages and replies
for msg_id, msg in enumerate(json_obj["messages"]):
if not msg["reply_to_message_id"]:
del json_obj["messages"][msg_id]["reply_to_message_id"]
json_obj["messages"] = [m for m in json_obj["messages"] if m["text"]]
return json_obj
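The numeric chat ID in `telegram_json_format` is built by keeping only the digits of the JID. A small sketch of that step follows; `numeric_chat_id` is a hypothetical helper name and the JIDs are made-up examples.

```python
def numeric_chat_id(jid: str) -> int:
    """Keep only the digits of a JID; fall back to 0 when there are none."""
    try:
        return int("".join(c for c in jid if c.isdigit()))
    except ValueError:
        # int("") raises ValueError, e.g. for "status@broadcast".
        return 0


personal = numeric_chat_id("85212345678@s.whatsapp.net")
status = numeric_chat_id("status@broadcast")
legacy_group = numeric_chat_id("85212345678-1600000000@g.us")
print(personal, status, legacy_group)
```

Note that a JID containing a hyphen-separated suffix yields one long concatenated number, since every digit in the string is kept.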
class WhatsAppIdentifier(StrEnum):
MESSAGE = "7c7fba66680ef796b916b067077cc246adacf01d" # AppDomainGroup-group.net.whatsapp.WhatsApp.shared-ChatStorage.sqlite
CONTACT = "b8548dc30aa1030df0ce18ef08b882cf7ab5212f" # AppDomainGroup-group.net.whatsapp.WhatsApp.shared-ContactsV2.sqlite
CALL = "1b432994e958845fffe8e2f190f26d1511534088" # AppDomainGroup-group.net.whatsapp.WhatsApp.shared-CallHistory.sqlite
# AppDomainGroup-group.net.whatsapp.WhatsApp.shared-ChatStorage.sqlite
MESSAGE = "7c7fba66680ef796b916b067077cc246adacf01d"
# AppDomainGroup-group.net.whatsapp.WhatsApp.shared-ContactsV2.sqlite
CONTACT = "b8548dc30aa1030df0ce18ef08b882cf7ab5212f"
# AppDomainGroup-group.net.whatsapp.WhatsApp.shared-CallHistory.sqlite
CALL = "1b432994e958845fffe8e2f190f26d1511534088"
DOMAIN = "AppDomainGroup-group.net.whatsapp.WhatsApp.shared"
class WhatsAppBusinessIdentifier(StrEnum):
MESSAGE = "724bd3b98b18518b455a87c1f3ac3a0d189c4466" # AppDomainGroup-group.net.whatsapp.WhatsAppSMB.shared-ChatStorage.sqlite
CONTACT = "d7246a707f51ddf8b17ee2dddabd9e0a4da5c552" # AppDomainGroup-group.net.whatsapp.WhatsAppSMB.shared-ContactsV2.sqlite
CALL = "b463f7c4365eefc5a8723930d97928d4e907c603" # AppDomainGroup-group.net.whatsapp.WhatsAppSMB.shared-CallHistory.sqlite
DOMAIN = "AppDomainGroup-group.net.whatsapp.WhatsAppSMB.shared"
# AppDomainGroup-group.net.whatsapp.WhatsAppSMB.shared-ChatStorage.sqlite
MESSAGE = "724bd3b98b18518b455a87c1f3ac3a0d189c4466"
# AppDomainGroup-group.net.whatsapp.WhatsAppSMB.shared-ContactsV2.sqlite
CONTACT = "d7246a707f51ddf8b17ee2dddabd9e0a4da5c552"
# AppDomainGroup-group.net.whatsapp.WhatsAppSMB.shared-CallHistory.sqlite
CALL = "b463f7c4365eefc5a8723930d97928d4e907c603"
DOMAIN = "AppDomainGroup-group.net.whatsapp.WhatsAppSMB.shared"
class JidType(IntEnum):
PM = 0

@@ -1,5 +1,11 @@
import vobject
import logging
import re
import quopri
from typing import List, TypedDict
from Whatsapp_Chat_Exporter.data_model import ChatStore
from Whatsapp_Chat_Exporter.utility import Device
class ExportedContactNumbers(TypedDict):
@@ -21,32 +27,155 @@ class ContactsFromVCards:
for number, name in self.contact_mapping:
# short number must be a bad contact, lets skip it
if len(number) <= 5:
continue
chats_search = filter_chats_by_prefix(chats, number).values()
if chats_search:
for chat in chats_search:
if not hasattr(chat, 'name') or (hasattr(chat, 'name') and chat.name is None):
setattr(chat, 'name', name)
else:
chats.add_chat(number + "@s.whatsapp.net", ChatStore(Device.ANDROID, name))
def decode_quoted_printable(value: str, charset: str) -> str:
"""Decode a vCard value that may be quoted-printable UTF-8."""
try:
bytes_val = quopri.decodestring(value)
return bytes_val.decode(charset, errors="replace")
except Exception:
# Fallback: return the original value if decoding fails
logging.warning(
f"Failed to decode quoted-printable value: {value}, "
f"charset: {charset}. Please report this issue."
)
return value
def _parse_vcard_line(line: str) -> tuple[str, dict[str, str], str] | None:
"""
Parses a single vCard property line into its components:
Property Name, Parameters (as a dict), and Value.
Example: 'FN;CHARSET=UTF-8:John Doe' -> ('FN', {'CHARSET': 'UTF-8'}, 'John Doe')
"""
# Find the first colon, which separates the property/parameters from the value.
colon_index = line.find(':')
if colon_index == -1:
return None # Invalid vCard line format
prop_and_params = line[:colon_index].strip()
value = line[colon_index + 1:].strip()
# Split property name from parameters
parts = prop_and_params.split(';')
property_name = parts[0].upper()
parameters = {}
for part in parts[1:]:
if '=' in part:
key, val = part.split('=', 1)
parameters[key.upper()] = val.strip('"') # Remove potential quotes from value
return property_name, parameters, value
def get_vcard_value(entry: str, field_name: str) -> list[str]:
"""
Scans the vCard entry for lines starting with the specific field_name
and returns a list of its decoded values, handling parameters like
ENCODING and CHARSET.
"""
target_name = field_name.upper()
cached_line = ""
charset = "utf-8"
values = []
for line in entry.splitlines():
line = line.strip()
if cached_line:
if line.endswith('='):
cached_line += line[:-1]
continue # Wait for the next line to complete the value
values.append(decode_quoted_printable(cached_line + line, charset))
cached_line = ""
else:
# Skip empty lines or lines that don't start with the target field (after stripping)
if not line or not line.upper().startswith(target_name):
continue
for chat in filter_chats_by_prefix(chats, number).values():
if not hasattr(chat, 'name') or (hasattr(chat, 'name') and chat.name is None):
setattr(chat, 'name', name)
parsed = _parse_vcard_line(line)
if parsed is None:
continue
prop_name, params, raw_value = parsed
if prop_name != target_name:
continue
encoding = params.get('ENCODING')
charset = params.get('CHARSET', 'utf-8')
# Apply decoding if ENCODING parameter is present
if encoding == 'QUOTED-PRINTABLE':
if raw_value.endswith('='):
# Handle soft line breaks in quoted-printable and cache the line
cached_line += raw_value[:-1]
continue # Wait for the next line to complete the value
values.append(decode_quoted_printable(raw_value, charset))
elif encoding:
raise NotImplementedError(f"Encoding '{encoding}' not supported yet.")
else:
values.append(raw_value)
return values
def process_vcard_entry(entry: str) -> dict | bool:
"""
Process a vCard entry using pure string manipulation.
Args:
entry: A string containing a single vCard block.
Returns:
A dictionary of the extracted data or False if required fields are missing.
"""
name = None
# Extract name in priority: FN -> N -> ORG
for field in ("FN", "N", "ORG"):
if name_values := get_vcard_value(entry, field):
name = name_values[0].replace(';', ' ') # Simple cleanup for structured name
break
if not name:
return False
numbers = get_vcard_value(entry, "TEL")
if not numbers:
return False
return {
"full_name": name,
# Remove duplicates
"numbers": set(numbers),
}
def read_vcards_file(vcf_file_path, default_country_code: str):
contacts = []
with open(vcf_file_path, mode="r", encoding="utf-8") as f:
reader = vobject.readComponents(f)
for row in reader:
if hasattr(row, 'fn'):
name = str(row.fn.value)
elif hasattr(row, 'n'):
name = str(row.n.value)
else:
name = None
if not hasattr(row, 'tel') or name is None:
continue
contact: ExportedContactNumbers = {
"full_name": name,
"numbers": list(map(lambda tel: tel.value, row.tel_list)),
}
with open(vcf_file_path, "r", encoding="utf-8", errors="ignore") as f:
content = f.read()
# Split into individual vCards
vcards = content.split("BEGIN:VCARD")
for vcard in vcards:
if "END:VCARD" not in vcard:
continue
if contact := process_vcard_entry(vcard):
contacts.append(contact)
logging.info(f"Imported {len(contacts)} contacts/vcards")
return map_number_to_name(contacts, default_country_code)
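The quoted-printable handling above depends on soft line breaks: a trailing `=` means the value continues on the next line. The sketch below shows that idea on a two-line `FN` property. `decode_qp_lines` is a hypothetical helper written for this illustration, not the project's `get_vcard_value`, and the vCard lines are invented.

```python
import quopri
from typing import List


def decode_qp_lines(lines: List[str], charset: str = "utf-8") -> str:
    """Join quoted-printable soft line breaks, then decode the result."""
    joined = ""
    for line in lines:
        line = line.strip()
        if line.endswith("="):
            joined += line[:-1]  # soft break: drop "=" and keep reading
        else:
            joined += line
            break
    raw_bytes = quopri.decodestring(joined.encode("ascii"))
    return raw_bytes.decode(charset, errors="replace")


raw = [
    "FN;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=E5=BC=B5=",
    "=E4=B8=89",
]
# The first colon separates the property and its parameters from the value.
header, _, first_value = raw[0].partition(":")
property_name = header.split(";")[0]
name = decode_qp_lines([first_value, raw[1]])
print(property_name, name)
```

Decoding only after the pieces are joined matters because a multi-byte UTF-8 character can be split across the soft break.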
@@ -77,6 +206,6 @@ def normalize_number(number: str, country_code: str):
return number[len(starting_char):]
# leading zero should be removed
if starting_char == '0':
if number.startswith('0'):
number = number[1:]
return country_code + number # fall back

@@ -1,20 +0,0 @@
# from contacts_names_from_vcards import readVCardsFile
from Whatsapp_Chat_Exporter.vcards_contacts import normalize_number, read_vcards_file
def test_readVCardsFile():
assert len(read_vcards_file("contacts.vcf", "973")) > 0
def test_create_number_to_name_dicts():
pass
def test_fuzzy_match_numbers():
pass
def test_normalize_number():
assert normalize_number('0531234567', '1') == '1531234567'
assert normalize_number('001531234567', '2') == '1531234567'
assert normalize_number('+1531234567', '34') == '1531234567'
assert normalize_number('053(123)4567', '34') == '34531234567'
assert normalize_number('0531-234-567', '58') == '58531234567'

@@ -1,329 +1,657 @@
<!DOCTYPE html>
<html>
<head>
<title>Whatsapp - {{ name }}</title>
<meta charset="UTF-8">
<link rel="stylesheet" href="{{w3css}}">
<style>
html, body {
font-size: 12px;
scroll-behavior: smooth;
}
header {
position: fixed;
z-index: 20;
border-bottom: 2px solid #e3e6e7;
font-size: 2em;
font-weight: bolder;
background-color: white;
padding: 20px 0 20px 0;
}
footer {
border-top: 2px solid #e3e6e7;
padding: 20px 0 20px 0;
}
article {
width:500px;
margin:100px auto;
z-index:10;
font-size: 15px;
word-wrap: break-word;
}
img, video {
max-width:100%;
}
div.reply{
font-size: 13px;
text-decoration: none;
}
div:target::before {
content: '';
display: block;
height: 115px;
margin-top: -115px;
visibility: hidden;
}
div:target {
border-style: solid;
border-width: 2px;
animation: border-blink 0.5s steps(1) 5;
border-color: rgba(0,0,0,0)
}
table {
width: 100%;
}
@keyframes border-blink {
0% {
border-color: #2196F3;
}
50% {
border-color: rgba(0,0,0,0);
}
}
.avatar {
border-radius:50%;
overflow:hidden;
max-width: 64px;
max-height: 64px;
}
.name {
color: #3892da;
}
.pad-left-10 {
padding-left: 10px;
}
.pad-right-10 {
padding-right: 10px;
}
.reply_link {
color: #168acc;
}
.blue {
color: #70777a;
}
.sticker {
max-width: 100px !important;
max-height: 100px !important;
}
</style>
<base href="{{ media_base }}" target="_blank">
</head>
<body>
<header class="w3-center w3-top">
{{ headline }}
{% if status is not none %}
<br>
<span class="w3-small">{{ status }}</span>
{% endif %}
</header>
<article class="w3-container">
<div class="table">
{% set last = {'last': 946688461.001} %}
{% for msg in msgs -%}
<div class="w3-row w3-padding-small w3-margin-bottom" id="{{ msg.key_id }}">
{% if determine_day(last.last, msg.timestamp) is not none %}
<div class="w3-center w3-padding-16 blue">{{ determine_day(last.last, msg.timestamp) }}</div>
{% if last.update({'last': msg.timestamp}) %}{% endif %}
{% endif %}
{% if msg.from_me == true %}
<div class="w3-row">
<div class="w3-left blue">{{ msg.time }}</div>
<div class="name w3-right-align pad-left-10">You</div>
</div>
<div class="w3-row">
{% if not no_avatar and my_avatar is not none %}
<div class="w3-col m10 l10">
{% else %}
<div class="w3-col m12 l12">
{% endif %}
<div class="w3-right-align">
{% if msg.reply is not none %}
<div class="reply">
<span class="blue">Replying to </span>
<a href="#{{msg.reply}}" target="_self" class="reply_link no-base">
{% if msg.quoted_data is not none %}
"{{msg.quoted_data}}"
{% else %}
this message
{% endif %}
</a>
</div>
{% endif %}
{% if msg.meta == true or msg.media == false and msg.data is none %}
<div class="w3-panel w3-border-blue w3-pale-blue w3-rightbar w3-leftbar w3-threequarter w3-center">
{% if msg.safe %}
<p>{{ msg.data | safe or 'Not supported WhatsApp internal message' }}</p>
{% else %}
<p>{{ msg.data or 'Not supported WhatsApp internal message' }}</p>
{% endif %}
</div>
{% if msg.caption is not none %}
<div class="w3-container">
{{ msg.caption | urlize(none, true, '_blank') }}
</div>
{% endif %}
{% else %}
{% if msg.media == false %}
{{ msg.data | sanitize_except() | urlize(none, true, '_blank') }}
{% else %}
{% if "image/" in msg.mime %}
<a href="{{ msg.data }}">
<img src="{{ msg.thumb if msg.thumb is not none else msg.data }}" {{ 'class="sticker"' | safe if msg.sticker }} loading="lazy"/>
</a>
{% elif "audio/" in msg.mime %}
<audio controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</audio>
{% elif "video/" in msg.mime %}
<video class="lazy" autobuffer {% if msg.message_type|int == 13 or msg.message_type|int == 11 %}autoplay muted loop playsinline{%else%}controls{% endif %}>
<source type="{{ msg.mime }}" data-src="{{ msg.data }}" />
</video>
{% elif "/" in msg.mime %}
<div class="w3-panel w3-border-blue w3-pale-blue w3-rightbar w3-leftbar w3-threequarter w3-center">
<p>The file cannot be displayed here, however it should be located at <a href="./{{ msg.data }}">here</a></p>
</div>
{% else %}
{% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %}
{% if msg.caption is not none %}
<div class="w3-container">
{{ msg.caption | urlize(none, true, '_blank') }}
</div>
{% endif %}
{% endif %}
{% endif %}
</div>
</div>
{% if not no_avatar and my_avatar is not none %}
<div class="w3-col m2 l2 pad-left-10">
<a href="{{ my_avatar }}">
<img src="{{ my_avatar }}" onerror="this.style.display='none'" class="avatar" loading="lazy">
</a>
</div>
{% endif %}
</div>
{% else %}
<div class="w3-row">
<div class="w3-left pad-right-10 name">
{% if msg.sender is not none %}
{{ msg.sender }}
{% else %}
{{ name }}
{% endif %}
</div>
<div class="w3-right-align blue">{{ msg.time }}</div>
</div>
<div class="w3-row">
{% if not no_avatar %}
<div class="w3-col m2 l2">
{% if their_avatar is not none %}
<a href="{{ their_avatar }}"><img src="{{ their_avatar_thumb or '' }}" onerror="this.style.display='none'" class="avatar" loading="lazy"></a>
{% else %}
<img src="{{ their_avatar_thumb or '' }}" onerror="this.style.display='none'" class="avatar" loading="lazy">
{% endif %}
</div>
<div class="w3-col m10 l10">
{% else %}
<div class="w3-col m12 l12">
{% endif %}
<div class="w3-left-align">
{% if msg.reply is not none %}
<div class="reply">
<span class="blue">Replying to </span>
<a href="#{{msg.reply}}" target="_self" class="reply_link no-base">
{% if msg.quoted_data is not none %}
"{{msg.quoted_data}}"
{% else %}
this message
{% endif %}
</a>
</div>
{% endif %}
{% if msg.meta == true or msg.media == false and msg.data is none %}
<div class="w3-panel w3-border-blue w3-pale-blue w3-rightbar w3-leftbar w3-threequarter w3-center">
{% if msg.safe %}
<p>{{ msg.data | safe or 'Not supported WhatsApp internal message' }}</p>
{% else %}
<p>{{ msg.data or 'Not supported WhatsApp internal message' }}</p>
{% endif %}
</div>
{% if msg.caption is not none %}
<div class="w3-container">
{{ msg.caption | urlize(none, true, '_blank') }}
</div>
{% endif %}
{% else %}
{% if msg.media == false %}
{{ msg.data | sanitize_except() | urlize(none, true, '_blank') }}
{% else %}
{% if "image/" in msg.mime %}
<a href="{{ msg.data }}">
<img src="{{ msg.thumb if msg.thumb is not none else msg.data }}" {{ 'class="sticker"' | safe if msg.sticker }} loading="lazy"/>
</a>
{% elif "audio/" in msg.mime %}
<audio controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</audio>
{% elif "video/" in msg.mime %}
<video class="lazy" autobuffer {% if msg.message_type|int == 13 or msg.message_type|int == 11 %}autoplay muted loop playsinline{%else%}controls{% endif %}>
<source type="{{ msg.mime }}" data-src="{{ msg.data }}" />
</video>
{% elif "/" in msg.mime %}
<div class="w3-panel w3-border-blue w3-pale-blue w3-rightbar w3-leftbar w3-threequarter w3-center">
<p>The file cannot be displayed here, however it should be located at <a href="./{{ msg.data }}">here</a></p>
</div>
{% else %}
{% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %}
{% if msg.caption is not none %}
<div class="w3-container">
{{ msg.caption | urlize(none, true, '_blank') }}
</div>
{% endif %}
{% endif %}
{% endif %}
</div>
</div>
</div>
{% endif %}
</div>
{% endfor %}
</div>
</article>
<footer class="w3-center">
<h2>
{% if previous %}
<a href="./{{ previous }}" target="_self">Previous</a>
{% endif %}
</h2>
<h2>
{% if next %}
<a href="./{{ next }}" target="_self">Next</a>
{% else %}
End of History
{% endif %}
</h2>
<br>
Portions of this page are reproduced from <a href="https://web.dev/articles/lazy-loading-video">work</a> created and <a href="https://developers.google.com/readme/policies">shared by Google</a> and used according to terms described in the <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache 2.0 License</a>.
</footer>
<script>
document.addEventListener("DOMContentLoaded", function() {
var lazyVideos = [].slice.call(document.querySelectorAll("video.lazy"));
<head>
<title>WhatsApp - {{ name }}</title>
<meta charset="UTF-8">
<script src="https://cdn.tailwindcss.com"></script>
<script>
tailwind.config = {
theme: {
extend: {
colors: {
whatsapp: {
light: '#e7ffdb',
DEFAULT: '#25D366',
dark: '#075E54',
chat: '#efeae2',
'chat-light': '#f0f2f5',
}
}
}
}
}
</script>
<style>
body, html {
height: 100%;
margin: 0;
padding: 0;
scroll-behavior: smooth !important;
}
.chat-list {
height: calc(100vh - 120px);
overflow-y: auto;
}
.message-list {
height: calc(100vh - 90px);
overflow-y: auto;
}
@media (max-width: 640px) {
.chat-list, .message-list {
height: calc(100vh - 108px);
}
}
header {
position: fixed;
z-index: 20;
border-bottom: 2px solid #e3e6e7;
font-size: 2em;
font-weight: bolder;
background-color: white;
padding: 20px 0 20px 0;
}
footer {
margin-top: 10px;
border-top: 2px solid #e3e6e7;
padding: 20px 0 20px 0;
}
article {
width:430px;
margin: auto;
z-index:10;
font-size: 15px;
word-wrap: break-word;
}
img, video, audio{
max-width:100%;
box-sizing: border-box;
}
div.reply{
font-size: 13px;
text-decoration: none;
}
div:target::before {
content: '';
display: block;
height: 115px;
margin-top: -115px;
visibility: hidden;
}
div:target {
animation: 3s highlight;
}
.avatar {
border-radius:50%;
overflow:hidden;
max-width: 64px;
max-height: 64px;
}
.name {
color: #3892da;
}
.pad-left-10 {
padding-left: 10px;
}
.pad-right-10 {
padding-right: 10px;
}
.reply_link {
color: #168acc;
}
.blue {
color: #70777a;
}
.sticker {
max-width: 100px !important;
max-height: 100px !important;
}
@keyframes highlight {
from {
background-color: rgba(37, 211, 102, 0.1);
}
to {
background-color: transparent;
}
}
.search-input {
transform: translateY(-100%);
transition: transform 0.3s ease-in-out;
}
.search-input.active {
transform: translateY(0);
}
.reply-box:active {
background-color:rgb(200 202 205 / var(--tw-bg-opacity, 1));
}
.info-box-tooltip {
--tw-translate-x: -50%;
transform: translate(var(--tw-translate-x), var(--tw-translate-y)) rotate(var(--tw-rotate)) skewX(var(--tw-skew-x)) skewY(var(--tw-skew-y)) scaleX(var(--tw-scale-x)) scaleY(var(--tw-scale-y));
}
if ("IntersectionObserver" in window) {
var lazyVideoObserver = new IntersectionObserver(function(entries, observer) {
entries.forEach(function(video) {
if (video.isIntersecting) {
for (var source in video.target.children) {
var videoSource = video.target.children[source];
if (typeof videoSource.tagName === "string" && videoSource.tagName === "SOURCE") {
videoSource.src = videoSource.dataset.src;
}
}
.status-indicator {
display: inline-block;
margin-left: 4px;
font-size: 0.8em;
color: #8c8c8c;
}
video.target.load();
video.target.classList.remove("lazy");
lazyVideoObserver.unobserve(video.target);
}
});
});
.status-indicator.read {
color: #34B7F1;
}
lazyVideos.forEach(function(lazyVideo) {
lazyVideoObserver.observe(lazyVideo);
});
}
});
</script>
<script>
// Prevent the <base> tag from affecting links with the class "no-base"
document.querySelectorAll('.no-base').forEach(link => {
link.addEventListener('click', function(event) {
const href = this.getAttribute('href');
if (href.startsWith('#')) {
window.location.hash = href;
event.preventDefault();
}
});
});
</script>
</body>
.play-icon {
width: 0;
height: 0;
border-left: 8px solid white;
border-top: 5px solid transparent;
border-bottom: 5px solid transparent;
filter: drop-shadow(0 1px 2px rgba(0, 0, 0, 0.3));
}
.speaker-icon {
position: relative;
width: 8px;
height: 6px;
background: #666;
border-radius: 1px 0 0 1px;
}
.speaker-icon::before {
content: '';
position: absolute;
right: -4px;
top: -1px;
width: 0;
height: 0;
border-left: 4px solid #666;
border-top: 4px solid transparent;
border-bottom: 4px solid transparent;
}
.speaker-icon::after {
content: '';
position: absolute;
right: -8px;
top: -3px;
width: 8px;
height: 12px;
border: 2px solid #666;
border-left: none;
border-radius: 0 8px 8px 0;
}
.search-icon {
width: 20px;
height: 20px;
position: relative;
display: inline-block;
}
.search-icon::before {
content: '';
position: absolute;
width: 12px;
height: 12px;
border: 2px solid #aebac1;
border-radius: 50%;
top: 2px;
left: 2px;
}
.search-icon::after {
content: '';
position: absolute;
width: 2px;
height: 6px;
background: #aebac1;
transform: rotate(45deg);
top: 12px;
left: 12px;
}
.arrow-left {
width: 0;
height: 0;
border-top: 6px solid transparent;
border-bottom: 6px solid transparent;
border-right: 8px solid #aebac1;
display: inline-block;
}
.arrow-right {
width: 0;
height: 0;
border-top: 6px solid transparent;
border-bottom: 6px solid transparent;
border-left: 8px solid #aebac1;
display: inline-block;
}
.info-icon {
width: 20px;
height: 20px;
border: 2px solid currentColor;
border-radius: 50%;
position: relative;
display: inline-block;
}
.info-icon::before {
content: 'i';
position: absolute;
top: 50%;
left: 50%;
transform: translate(-50%, -50%);
font-size: 12px;
font-weight: bold;
font-style: normal;
}
</style>
<script>
function search(event) {
// Declare locals explicitly so they do not leak as implicit globals
const keywords = document.getElementById("mainHeaderSearchInput").value;
const hits = [];
document.querySelectorAll(".message-text").forEach(elem => {
if (elem.innerText.trim().includes(keywords)){
hits.push(elem.parentElement.parentElement.id);
}
})
console.log(hits);
}
</script>
<base href="{{ media_base }}" target="_blank">
</head>
<body>
<article class="h-screen bg-whatsapp-chat-light">
<div class="w-full flex flex-col">
<div class="p-3 bg-whatsapp-dark flex items-center justify-between border-l border-[#d1d7db]">
<div class="flex items-center">
{% if not no_avatar %}
<div class="w3-col m2 l2">
{% if their_avatar is not none %}
<a href="{{ their_avatar }}"><img src="{{ their_avatar_thumb or '' }}" onerror="this.style.display='none'" class="w-10 h-10 rounded-full mr-3" loading="lazy"></a>
{% else %}
<img src="{{ their_avatar_thumb or '' }}" onerror="this.style.display='none'" class="w-10 h-10 rounded-full mr-3" loading="lazy">
{% endif %}
</div>
{% endif %}
<div>
<h2 class="text-white font-medium">{{ headline }}</h2>
{% if status is not none %}<p class="text-[#8696a0] text-xs">{{ status }}</p>{% endif %}
</div>
</div>
<div class="flex space-x-4">
<!-- <button id="searchButton">
<span class="search-icon"></span>
</button> -->
<!-- <span class="arrow-left"></span> -->
{% if previous %}
<a href="./{{ previous }}" target="_self">
<span class="arrow-left"></span>
</a>
{% endif %}
{% if next %}
<a href="./{{ next }}" target="_self">
<span class="arrow-right"></span>
</a>
{% endif %}
</div>
<!-- Search Input Overlay -->
<div id="mainSearchInput" class="search-input absolute article top-0 bg-whatsapp-dark p-3 flex items-center space-x-3">
<button id="closeMainSearch" class="text-[#aebac1]">
<span class="arrow-left"></span>
</button>
<input type="text" placeholder="Search..." class="flex-1 bg-[#1f2c34] text-white rounded-lg px-3 py-1 focus:outline-none" id="mainHeaderSearchInput" onkeyup="search(event)">
</div>
</div>
</div>
<div class="flex-1 p-5 message-list">
<div class="flex flex-col space-y-2">
<!--Date-->
{% set last = {'last': 946688461.001} %}
{% for msg in msgs -%}
{% if determine_day(last.last, msg.timestamp) is not none %}
<div class="flex justify-center">
<div class="bg-[#e1f2fb] rounded-lg px-2 py-1 text-xs text-[#54656f]">
{{ determine_day(last.last, msg.timestamp) }}
</div>
</div>
{% if last.update({'last': msg.timestamp}) %}{% endif %}
{% endif %}
<!--Actual messages-->
{% if msg.from_me == true %}
<div class="flex justify-end items-center group" id="{{ msg.key_id }}">
<div class="opacity-0 group-hover:opacity-100 transition-opacity duration-200 relative mr-2">
<div class="relative">
<div class="relative group/tooltip">
<svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-[#8696a0] hover:text-[#54656f] cursor-pointer" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<use href="#info-icon"></use>
</svg>
<div class="absolute bottom-full info-box-tooltip mb-2 hidden group-hover/tooltip:block z-50">
<div class="bg-black text-white text-xs rounded py-1 px-2 whitespace-nowrap">
Delivered at {{msg.received_timestamp or 'unknown'}}
{% if msg.read_timestamp is not none %}
<br>Read at {{ msg.read_timestamp }}
{% endif %}
</div>
<div class="absolute top-full right-3 -mt-1 border-4 border-transparent border-t-black"></div>
</div>
</div>
</div>
</div>
<div class="bg-whatsapp-light rounded-lg p-2 max-w-[80%] shadow-sm relative {% if msg.reactions %}mb-2{% endif %}">
{% if msg.reply is not none %}
<a href="#{{msg.reply}}" target="_self" class="no-base">
<div
class="mb-2 p-1 bg-whatsapp-chat-light rounded border-l-4 border-whatsapp text-sm reply-box">
<div class="flex items-center gap-2">
<div class="flex-1 overflow-hidden">
<p class="text-whatsapp font-medium text-xs">Replying to</p>
<p class="text-[#111b21] text-xs truncate">
{% if msg.quoted_data is not none %}
"{{msg.quoted_data}}"
{% else %}
this message
{% endif %}
</p>
</div>
{% set replied_msg = msgs | selectattr('key_id', 'equalto', msg.reply) | first %}
{% if replied_msg and replied_msg.media == true %}
<div class="flex-shrink-0">
{% if "image/" in replied_msg.mime %}
<img src="{{ replied_msg.thumb if replied_msg.thumb is not none else replied_msg.data }}"
class="w-8 h-8 rounded object-cover" loading="lazy" />
{% elif "video/" in replied_msg.mime %}
<div class="relative w-8 h-8 rounded overflow-hidden bg-gray-200">
<img src="{{ replied_msg.thumb if replied_msg.thumb is not none else replied_msg.data }}"
class="w-full h-full object-cover" loading="lazy" />
<div class="absolute inset-0 flex items-center justify-center">
<div class="play-icon"></div>
</div>
</div>
{% elif "audio/" in replied_msg.mime %}
<div class="w-8 h-8 rounded bg-gray-200 flex items-center justify-center">
<div class="speaker-icon"></div>
</div>
{% endif %}
</div>
{% endif %}
</div>
</div>
</a>
{% endif %}
<p class="text-[#111b21] text-sm message-text">
{% if msg.meta == true or msg.media == false and msg.data is none %}
<div class="flex justify-center mb-2">
<div class="bg-[#FFF3C5] rounded-lg px-3 py-2 text-sm text-[#856404] flex items-center">
{% if msg.safe %}
{{ msg.data | safe or 'Not supported WhatsApp internal message' }}
{% else %}
{{ msg.data or 'Not supported WhatsApp internal message' }}
{% endif %}
</div>
</div>
{% if msg.caption is not none %}
<p>{{ msg.caption | urlize(none, true, '_blank') }}</p>
{% endif %}
{% else %}
{% if msg.media == false %}
{{ msg.data | sanitize_except() | urlize(none, true, '_blank') }}
{% else %}
{% if "image/" in msg.mime %}
<a href="{{ msg.data }}">
<img src="{{ msg.thumb if msg.thumb is not none else msg.data }}" {{ 'class="sticker"' | safe if msg.sticker }} loading="lazy"/>
</a>
{% elif "audio/" in msg.mime %}
<audio controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</audio>
{% elif "video/" in msg.mime %}
<video class="lazy" autobuffer {% if msg.message_type|int == 13 or msg.message_type|int == 11 %}autoplay muted loop playsinline{%else%}controls{% endif %}>
<source type="{{ msg.mime }}" data-src="{{ msg.data }}" />
</video>
{% elif "/" in msg.mime %}
The file cannot be displayed here; however, it should be located <a href="./{{ msg.data }}">here</a>.
{% else %}
{% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %}
{% if msg.caption is not none %}
<p class='mt-1 {% if "audio/" in msg.mime %}text-[#808080]{% endif %}'>
{{ msg.caption | urlize(none, true, '_blank') }}
</p>
{% endif %}
{% endif %}
{% endif %}
</p>
<p class="text-[10px] text-[#667781] text-right mt-1">{{ msg.time }}
<span class="status-indicator{% if msg.read_timestamp %} read{% endif %}">
{% if msg.received_timestamp %}
✓✓
{% else %}
{% endif %}
</span>
</p>
{% if msg.reactions %}
<div class="flex flex-wrap gap-1 mt-1 justify-end absolute -bottom-3 -right-2">
{% for sender, emoji in msg.reactions.items() %}
<div class="bg-white rounded-full px-1.5 py-0.5 text-xs shadow-sm border border-gray-200 cursor-help" title="{{ sender }}">
{{ emoji }}
</div>
{% endfor %}
</div>
{% endif %}
</div>
</div>
{% else %}
<div class="flex justify-start items-center group" id="{{ msg.key_id }}">
<div class="bg-white rounded-lg p-2 max-w-[80%] shadow-sm relative {% if msg.reactions %}mb-2{% endif %}">
{% if msg.reply is not none %}
<a href="#{{msg.reply}}" target="_self" class="no-base">
<div
class="mb-2 p-1 bg-whatsapp-chat-light rounded border-l-4 border-whatsapp text-sm reply-box">
<div class="flex items-center gap-2">
<div class="flex-1 overflow-hidden">
<p class="text-whatsapp font-medium text-xs">Replying to</p>
<p class="text-[#808080] text-xs truncate">
{% if msg.quoted_data is not none %}
{{msg.quoted_data}}
{% else %}
this message
{% endif %}
</p>
</div>
{% set replied_msg = msgs | selectattr('key_id', 'equalto', msg.reply) | first %}
{% if replied_msg and replied_msg.media == true %}
<div class="flex-shrink-0">
{% if "image/" in replied_msg.mime %}
<img src="{{ replied_msg.thumb if replied_msg.thumb is not none else replied_msg.data }}"
class="w-8 h-8 rounded object-cover" loading="lazy" />
{% elif "video/" in replied_msg.mime %}
<div class="relative w-8 h-8 rounded overflow-hidden bg-gray-200">
<img src="{{ replied_msg.thumb if replied_msg.thumb is not none else replied_msg.data }}"
class="w-full h-full object-cover" loading="lazy" />
<div class="absolute inset-0 flex items-center justify-center">
<div class="play-icon"></div>
</div>
</div>
{% elif "audio/" in replied_msg.mime %}
<div class="w-8 h-8 rounded bg-gray-200 flex items-center justify-center">
<div class="speaker-icon"></div>
</div>
{% endif %}
</div>
{% endif %}
</div>
</div>
</a>
{% endif %}
<p class="text-[#111b21] text-sm">
{% if msg.meta == true or msg.media == false and msg.data is none %}
<div class="flex justify-center mb-2">
<div class="bg-[#FFF3C5] rounded-lg px-3 py-2 text-sm text-[#856404] flex items-center">
{% if msg.safe %}
{{ msg.data | safe or 'Not supported WhatsApp internal message' }}
{% else %}
{{ msg.data or 'Not supported WhatsApp internal message' }}
{% endif %}
</div>
</div>
{% if msg.caption is not none %}
<p>{{ msg.caption | urlize(none, true, '_blank') }}</p>
{% endif %}
{% else %}
{% if msg.media == false %}
{{ msg.data | sanitize_except() | urlize(none, true, '_blank') }}
{% else %}
{% if "image/" in msg.mime %}
<a href="{{ msg.data }}">
<img src="{{ msg.thumb if msg.thumb is not none else msg.data }}" {{ 'class="sticker"' | safe if msg.sticker }} loading="lazy"/>
</a>
{% elif "audio/" in msg.mime %}
<audio controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</audio>
{% elif "video/" in msg.mime %}
<video class="lazy" autobuffer {% if msg.message_type|int == 13 or msg.message_type|int == 11 %}autoplay muted loop playsinline{%else%}controls{% endif %}>
<source type="{{ msg.mime }}" data-src="{{ msg.data }}" />
</video>
{% elif "/" in msg.mime %}
The file cannot be displayed here; however, it should be located <a href="./{{ msg.data }}">here</a>.
{% else %}
{% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %}
{% if msg.caption is not none %}
<p class='mt-1 {% if "audio/" in msg.mime %}text-[#808080]{% endif %}'>
{{ msg.caption | urlize(none, true, '_blank') }}
</p>
{% endif %}
{% endif %}
{% endif %}
</p>
<div class="flex items-baseline text-[10px] text-[#667781] mt-1 gap-2">
<span class="flex-shrink-0">
{% if msg.sender is not none %}
{{ msg.sender }}
{% endif %}
</span>
<span class="flex-grow min-w-[4px]"></span>
<span class="flex-shrink-0">{{ msg.time }}</span>
</div>
{% if msg.reactions %}
<div class="flex flex-wrap gap-1 mt-1 justify-start absolute -bottom-3 -left-2">
{% for sender, emoji in msg.reactions.items() %}
<div class="bg-gray-100 rounded-full px-1.5 py-0.5 text-xs shadow-sm border border-gray-200 cursor-help" title="{{ sender }}">
{{ emoji }}
</div>
{% endfor %}
</div>
{% endif %}
</div>
<!-- <div class="opacity-0 group-hover:opacity-100 transition-opacity duration-200 relative ml-2">
<div class="relative">
<div class="relative group/tooltip">
<svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-[#8696a0] hover:text-[#54656f] cursor-pointer" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<use href="#info-icon"></use>
</svg>
<div class="absolute bottom-full info-box-tooltip mb-2 hidden group-hover/tooltip:block z-50">
<div class="bg-black text-white text-xs rounded py-1 px-2 whitespace-nowrap">
Received at {{msg.received_timestamp or 'unknown'}}
</div>
<div class="absolute top-full right-3 ml-1 border-4 border-transparent border-t-black"></div>
</div>
</div>
</div>
</div> -->
</div>
{% endif %}
{% endfor %}
</div>
<footer>
{% if not next %}
<div class="flex justify-center mb-6">
<div class="bg-[#e1f2fb] rounded-lg px-3 py-2 text-sm text-[#54656f]">
End of History
</div>
</div>
{% endif %}
<br>
Portions of this page are reproduced from <a href="https://web.dev/articles/lazy-loading-video">work</a>
created and <a href="https://developers.google.com/readme/policies">shared by Google</a> and used
according to terms described in the <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache 2.0
License</a>.
</footer>
</div>
</article>
</body>
<script>
// Search functionality
const searchButton = document.getElementById('searchButton');
const mainSearchInput = document.getElementById('mainSearchInput');
const closeMainSearch = document.getElementById('closeMainSearch');
const mainHeaderSearchInput = document.getElementById('mainHeaderSearchInput');
// Function to show search input
const showSearch = () => {
mainSearchInput.classList.add('active');
mainHeaderSearchInput.focus();
};
// Function to hide search input
const hideSearch = () => {
mainSearchInput.classList.remove('active');
mainHeaderSearchInput.value = '';
};
// Event listeners
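// The #searchButton element is commented out in the markup, so guard against a null reference
if (searchButton) {
searchButton.addEventListener('click', showSearch);
}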
closeMainSearch.addEventListener('click', hideSearch);
// Handle ESC key
document.addEventListener('keydown', (event) => {
if (event.key === 'Escape' && mainSearchInput.classList.contains('active')) {
hideSearch();
}
});
</script>
<script>
document.addEventListener("DOMContentLoaded", function() {
var lazyVideos = [].slice.call(document.querySelectorAll("video.lazy"));
if ("IntersectionObserver" in window) {
var lazyVideoObserver = new IntersectionObserver(function(entries, observer) {
entries.forEach(function(video) {
if (video.isIntersecting) {
for (var source in video.target.children) {
var videoSource = video.target.children[source];
if (typeof videoSource.tagName === "string" && videoSource.tagName === "SOURCE") {
videoSource.src = videoSource.dataset.src;
}
}
video.target.load();
video.target.classList.remove("lazy");
lazyVideoObserver.unobserve(video.target);
}
});
});
lazyVideos.forEach(function(lazyVideo) {
lazyVideoObserver.observe(lazyVideo);
});
}
});
</script>
<script>
// Prevent the <base> tag from affecting links with the class "no-base"
document.querySelectorAll('.no-base').forEach(link => {
link.addEventListener('click', function(event) {
const href = this.getAttribute('href');
if (href.startsWith('#')) {
window.location.hash = href;
event.preventDefault();
}
});
});
</script>
</html>

@@ -1,467 +0,0 @@
<!DOCTYPE html>
<html>
<head>
<title>WhatsApp - {{ name }}</title>
<meta charset="UTF-8">
<script src="https://cdn.tailwindcss.com"></script>
<script>
tailwind.config = {
theme: {
extend: {
colors: {
whatsapp: {
light: '#e7ffdb',
DEFAULT: '#25D366',
dark: '#075E54',
chat: '#efeae2',
'chat-light': '#f0f2f5',
}
}
}
}
}
</script>
<style>
body, html {
height: 100%;
margin: 0;
padding: 0;
scroll-behavior: smooth !important;
}
.chat-list {
height: calc(100vh - 120px);
overflow-y: auto;
}
.message-list {
height: calc(100vh - 90px);
overflow-y: auto;
}
@media (max-width: 640px) {
.chat-list, .message-list {
height: calc(100vh - 108px);
}
}
header {
position: fixed;
z-index: 20;
border-bottom: 2px solid #e3e6e7;
font-size: 2em;
font-weight: bolder;
background-color: white;
padding: 20px 0 20px 0;
}
footer {
margin-top: 10px;
border-top: 2px solid #e3e6e7;
padding: 20px 0 20px 0;
}
article {
width:430px;
margin: auto;
z-index:10;
font-size: 15px;
word-wrap: break-word;
}
img, video, audio{
max-width:100%;
box-sizing: border-box;
}
div.reply{
font-size: 13px;
text-decoration: none;
}
div:target::before {
content: '';
display: block;
height: 115px;
margin-top: -115px;
visibility: hidden;
}
div:target {
animation: 3s highlight;
}
.avatar {
border-radius:50%;
overflow:hidden;
max-width: 64px;
max-height: 64px;
}
.name {
color: #3892da;
}
.pad-left-10 {
padding-left: 10px;
}
.pad-right-10 {
padding-right: 10px;
}
.reply_link {
color: #168acc;
}
.blue {
color: #70777a;
}
.sticker {
max-width: 100px !important;
max-height: 100px !important;
}
@keyframes highlight {
from {
background-color: rgba(37, 211, 102, 0.1);
}
to {
background-color: transparent;
}
}
.search-input {
transform: translateY(-100%);
transition: transform 0.3s ease-in-out;
}
.search-input.active {
transform: translateY(0);
}
.reply-box:active {
background-color:rgb(200 202 205 / var(--tw-bg-opacity, 1));
}
.info-box-tooltip {
--tw-translate-x: -50%;
transform: translate(var(--tw-translate-x), var(--tw-translate-y)) rotate(var(--tw-rotate)) skewX(var(--tw-skew-x)) skewY(var(--tw-skew-y)) scaleX(var(--tw-scale-x)) scaleY(var(--tw-scale-y));
}
</style>
<script>
function search(event) {
// Declare locals explicitly so they do not leak as implicit globals
const keywords = document.getElementById("mainHeaderSearchInput").value;
const hits = [];
document.querySelectorAll(".message-text").forEach(elem => {
if (elem.innerText.trim().includes(keywords)){
hits.push(elem.parentElement.parentElement.id);
}
})
console.log(hits);
}
</script>
<base href="{{ media_base }}" target="_blank">
</head>
<body>
<article class="h-screen bg-whatsapp-chat-light">
<div class="w-full flex flex-col">
<div class="p-3 bg-whatsapp-dark flex items-center justify-between border-l border-[#d1d7db]">
<div class="flex items-center">
{% if not no_avatar %}
<div class="w3-col m2 l2">
{% if their_avatar is not none %}
<a href="{{ their_avatar }}"><img src="{{ their_avatar_thumb or '' }}" onerror="this.style.display='none'" class="w-10 h-10 rounded-full mr-3" loading="lazy"></a>
{% else %}
<img src="{{ their_avatar_thumb or '' }}" onerror="this.style.display='none'" class="w-10 h-10 rounded-full mr-3" loading="lazy">
{% endif %}
</div>
{% endif %}
<div>
<h2 class="text-white font-medium">{{ headline }}</h2>
{% if status is not none %}<p class="text-[#8696a0] text-xs">{{ status }}</p>{% endif %}
</div>
</div>
<div class="flex space-x-4">
<!-- <button id="searchButton">
<svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-[#aebac1]" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M21 21l-6-6m2-5a7 7 0 11-14 0 7 7 0 0114 0z" />
</svg>
</button> -->
<!-- <svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-[#aebac1]" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M15 19l-7-7 7-7" />
</svg> -->
{% if previous %}
<a href="./{{ previous }}" target="_self">
<svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-[#aebac1]" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M15 5l-7 7 7 7" />
</svg>
</a>
{% endif %}
{% if next %}
<a href="./{{ next }}" target="_self">
<svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-[#aebac1]" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M9 5l7 7-7 7" />
</svg>
</a>
{% endif %}
</div>
<!-- Search Input Overlay -->
<div id="mainSearchInput" class="search-input absolute article top-0 bg-whatsapp-dark p-3 flex items-center space-x-3">
<button id="closeMainSearch" class="text-[#aebac1]">
<svg xmlns="http://www.w3.org/2000/svg" class="h-6 w-6" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M15 19l-7-7 7-7" />
</svg>
</button>
<input type="text" placeholder="Search..." class="flex-1 bg-[#1f2c34] text-white rounded-lg px-3 py-1 focus:outline-none" id="mainHeaderSearchInput" onkeyup="search(event)">
</div>
</div>
</div>
<div class="flex-1 p-5 message-list">
<div class="flex flex-col space-y-2">
<!--Date-->
{% set last = {'last': 946688461.001} %}
{% for msg in msgs -%}
{% if determine_day(last.last, msg.timestamp) is not none %}
<div class="flex justify-center">
<div class="bg-[#e1f2fb] rounded-lg px-2 py-1 text-xs text-[#54656f]">
{{ determine_day(last.last, msg.timestamp) }}
</div>
</div>
{% if last.update({'last': msg.timestamp}) %}{% endif %}
{% endif %}
<!--Actual messages-->
{% if msg.from_me == true %}
<div class="flex justify-end items-center group" id="{{ msg.key_id }}">
<div class="opacity-0 group-hover:opacity-100 transition-opacity duration-200 relative mr-2">
<div class="relative">
<div class="relative group/tooltip">
<svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-[#8696a0] hover:text-[#54656f] cursor-pointer" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<use href="#info-icon"></use>
</svg>
<div class="absolute bottom-full info-box-tooltip mb-2 hidden group-hover/tooltip:block z-50">
<div class="bg-black text-white text-xs rounded py-1 px-2 whitespace-nowrap">
Delivered at {{msg.received_timestamp or 'unknown'}}
{% if msg.read_timestamp is not none %}
<br>Read at {{ msg.read_timestamp }}
{% endif %}
</div>
<div class="absolute top-full right-3 -mt-1 border-4 border-transparent border-t-black"></div>
</div>
</div>
</div>
</div>
<div class="bg-whatsapp-light rounded-lg p-2 max-w-[80%] shadow-sm">
{% if msg.reply is not none %}
<a href="#{{msg.reply}}" target="_self" class="no-base">
<div class="mb-2 p-1 bg-whatsapp-chat-light rounded border-l-4 border-whatsapp text-sm reply-box">
<p class="text-whatsapp font-medium text-xs">Replying to</p>
<p class="text-[#111b21] text-xs truncate">
{% if msg.quoted_data is not none %}
"{{msg.quoted_data}}"
{% else %}
this message
{% endif %}
</p>
</div>
</a>
{% endif %}
<p class="text-[#111b21] text-sm message-text">
{% if msg.meta == true or msg.media == false and msg.data is none %}
<div class="flex justify-center mb-2">
<div class="bg-[#FFF3C5] rounded-lg px-3 py-2 text-sm text-[#856404] flex items-center">
{% if msg.safe %}
{{ msg.data | safe or 'Not supported WhatsApp internal message' }}
{% else %}
{{ msg.data or 'Not supported WhatsApp internal message' }}
{% endif %}
</div>
</div>
{% if msg.caption is not none %}
<p>{{ msg.caption | urlize(none, true, '_blank') }}</p>
{% endif %}
{% else %}
{% if msg.media == false %}
{{ msg.data | sanitize_except() | urlize(none, true, '_blank') }}
{% else %}
{% if "image/" in msg.mime %}
<a href="{{ msg.data }}">
<img src="{{ msg.thumb if msg.thumb is not none else msg.data }}" {{ 'class="sticker"' | safe if msg.sticker }} loading="lazy"/>
</a>
{% elif "audio/" in msg.mime %}
<audio controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</audio>
{% elif "video/" in msg.mime %}
<video class="lazy" autobuffer {% if msg.message_type|int == 13 or msg.message_type|int == 11 %}autoplay muted loop playsinline{%else%}controls{% endif %}>
<source type="{{ msg.mime }}" data-src="{{ msg.data }}" />
</video>
{% elif "/" in msg.mime %}
The file cannot be displayed here; however, it should be located <a href="./{{ msg.data }}">here</a>.
{% else %}
{% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %}
{% if msg.caption is not none %}
{{ msg.caption | urlize(none, true, '_blank') }}
{% endif %}
{% endif %}
{% endif %}
</p>
<p class="text-[10px] text-[#667781] text-right mt-1">{{ msg.time }}</p>
</div>
</div>
{% else %}
<div class="flex justify-start items-center group" id="{{ msg.key_id }}">
<div class="bg-white rounded-lg p-2 max-w-[80%] shadow-sm">
{% if msg.reply is not none %}
<a href="#{{msg.reply}}" target="_self" class="no-base">
<div class="mb-2 p-1 bg-whatsapp-chat-light rounded border-l-4 border-whatsapp text-sm reply-box">
<p class="text-whatsapp font-medium text-xs">Replying to</p>
<p class="text-[#808080] text-xs truncate">
{% if msg.quoted_data is not none %}
{{msg.quoted_data}}
{% else %}
this message
{% endif %}
</p>
</div>
</a>
{% endif %}
<p class="text-[#111b21] text-sm">
{% if msg.meta == true or msg.media == false and msg.data is none %}
<div class="flex justify-center mb-2">
<div class="bg-[#FFF3C5] rounded-lg px-3 py-2 text-sm text-[#856404] flex items-center">
{% if msg.safe %}
{{ msg.data | safe or 'Not supported WhatsApp internal message' }}
{% else %}
{{ msg.data or 'Not supported WhatsApp internal message' }}
{% endif %}
</div>
</div>
{% if msg.caption is not none %}
<p>{{ msg.caption | urlize(none, true, '_blank') }}</p>
{% endif %}
{% else %}
{% if msg.media == false %}
{{ msg.data | sanitize_except() | urlize(none, true, '_blank') }}
{% else %}
{% if "image/" in msg.mime %}
<a href="{{ msg.data }}">
<img src="{{ msg.thumb if msg.thumb is not none else msg.data }}" {{ 'class="sticker"' | safe if msg.sticker }} loading="lazy"/>
</a>
{% elif "audio/" in msg.mime %}
<audio controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</audio>
{% elif "video/" in msg.mime %}
<video class="lazy" autobuffer {% if msg.message_type|int == 13 or msg.message_type|int == 11 %}autoplay muted loop playsinline{%else%}controls{% endif %}>
<source type="{{ msg.mime }}" data-src="{{ msg.data }}" />
</video>
{% elif "/" in msg.mime %}
The file cannot be displayed here; however, it should be located <a href="./{{ msg.data }}">here</a>.
{% else %}
{% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %}
{% if msg.caption is not none %}
{{ msg.caption | urlize(none, true, '_blank') }}
{% endif %}
{% endif %}
{% endif %}
</p>
<div class="flex items-baseline text-[10px] text-[#667781] mt-1 gap-2">
<span class="flex-shrink-0">
{% if msg.sender is not none %}
{{ msg.sender }}
{% endif %}
</span>
<span class="flex-grow min-w-[4px]"></span>
<span class="flex-shrink-0">{{ msg.time }}</span>
</div>
</div>
<!-- <div class="opacity-0 group-hover:opacity-100 transition-opacity duration-200 relative ml-2">
<div class="relative">
<div class="relative group/tooltip">
<svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5 text-[#8696a0] hover:text-[#54656f] cursor-pointer" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<use href="#info-icon"></use>
</svg>
<div class="absolute bottom-full info-box-tooltip mb-2 hidden group-hover/tooltip:block z-50">
<div class="bg-black text-white text-xs rounded py-1 px-2 whitespace-nowrap">
Received at {{msg.received_timestamp or 'unknown'}}
</div>
<div class="absolute top-full right-3 ml-1 border-4 border-transparent border-t-black"></div>
</div>
</div>
</div>
</div> -->
</div>
{% endif %}
{% endfor %}
</div>
<footer>
<h2 class="text-center">
{% if not next %}
End of History
{% endif %}
</h2>
<br>
Portions of this page are reproduced from <a href="https://web.dev/articles/lazy-loading-video">work</a> created and <a href="https://developers.google.com/readme/policies">shared by Google</a> and used according to terms described in the <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache 2.0 License</a>.
</footer>
<svg style="display: none;">
<!-- Tooltip info icon -->
<symbol id="info-icon" viewBox="0 0 24 24">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M13 16h-1v-4h-1m1-4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
</symbol>
</svg>
</div>
</article>
</body>
<script>
// Search functionality
const searchButton = document.getElementById('searchButton');
const mainSearchInput = document.getElementById('mainSearchInput');
const closeMainSearch = document.getElementById('closeMainSearch');
const mainHeaderSearchInput = document.getElementById('mainHeaderSearchInput');
// Function to show search input
const showSearch = () => {
mainSearchInput.classList.add('active');
mainHeaderSearchInput.focus();
};
// Function to hide search input
const hideSearch = () => {
mainSearchInput.classList.remove('active');
mainHeaderSearchInput.value = '';
};
// Event listeners
searchButton.addEventListener('click', showSearch);
closeMainSearch.addEventListener('click', hideSearch);
// Handle ESC key
document.addEventListener('keydown', (event) => {
if (event.key === 'Escape' && mainSearchInput.classList.contains('active')) {
hideSearch();
}
});
</script>
<script>
document.addEventListener("DOMContentLoaded", function() {
var lazyVideos = [].slice.call(document.querySelectorAll("video.lazy"));
if ("IntersectionObserver" in window) {
var lazyVideoObserver = new IntersectionObserver(function(entries, observer) {
entries.forEach(function(video) {
if (video.isIntersecting) {
for (var source in video.target.children) {
var videoSource = video.target.children[source];
if (typeof videoSource.tagName === "string" && videoSource.tagName === "SOURCE") {
videoSource.src = videoSource.dataset.src;
}
}
video.target.load();
video.target.classList.remove("lazy");
lazyVideoObserver.unobserve(video.target);
}
});
});
lazyVideos.forEach(function(lazyVideo) {
lazyVideoObserver.observe(lazyVideo);
});
}
});
</script>
<script>
// Prevent the <base> tag from affecting links with the class "no-base"
document.querySelectorAll('.no-base').forEach(link => {
link.addEventListener('click', function(event) {
const href = this.getAttribute('href');
if (href.startsWith('#')) {
window.location.hash = href;
event.preventDefault();
}
});
});
</script>
</html>

@@ -0,0 +1,329 @@
<!DOCTYPE html>
<html>
<head>
<title>Whatsapp - {{ name }}</title>
<meta charset="UTF-8">
<link rel="stylesheet" href="{{w3css}}">
<style>
html, body {
font-size: 12px;
scroll-behavior: smooth;
}
header {
position: fixed;
z-index: 20;
border-bottom: 2px solid #e3e6e7;
font-size: 2em;
font-weight: bolder;
background-color: white;
padding: 20px 0 20px 0;
}
footer {
border-top: 2px solid #e3e6e7;
padding: 20px 0 20px 0;
}
article {
width:500px;
margin:100px auto;
z-index:10;
font-size: 15px;
word-wrap: break-word;
}
img, video {
max-width:100%;
}
div.reply{
font-size: 13px;
text-decoration: none;
}
div:target::before {
content: '';
display: block;
height: 115px;
margin-top: -115px;
visibility: hidden;
}
div:target {
border-style: solid;
border-width: 2px;
animation: border-blink 0.5s steps(1) 5;
border-color: rgba(0,0,0,0)
}
table {
width: 100%;
}
@keyframes border-blink {
0% {
border-color: #2196F3;
}
50% {
border-color: rgba(0,0,0,0);
}
}
.avatar {
border-radius:50%;
overflow:hidden;
max-width: 64px;
max-height: 64px;
}
.name {
color: #3892da;
}
.pad-left-10 {
padding-left: 10px;
}
.pad-right-10 {
padding-right: 10px;
}
.reply_link {
color: #168acc;
}
.blue {
color: #70777a;
}
.sticker {
max-width: 100px !important;
max-height: 100px !important;
}
</style>
<base href="{{ media_base }}" target="_blank">
</head>
<body>
<header class="w3-center w3-top">
{{ headline }}
{% if status is not none %}
<br>
<span class="w3-small">{{ status }}</span>
{% endif %}
</header>
<article class="w3-container">
<div class="table">
{% set last = {'last': 946688461.001} %}
{% for msg in msgs -%}
<div class="w3-row w3-padding-small w3-margin-bottom" id="{{ msg.key_id }}">
{% if determine_day(last.last, msg.timestamp) is not none %}
<div class="w3-center w3-padding-16 blue">{{ determine_day(last.last, msg.timestamp) }}</div>
{% if last.update({'last': msg.timestamp}) %}{% endif %}
{% endif %}
{% if msg.from_me == true %}
<div class="w3-row">
<div class="w3-left blue">{{ msg.time }}</div>
<div class="name w3-right-align pad-left-10">You</div>
</div>
<div class="w3-row">
{% if not no_avatar and my_avatar is not none %}
<div class="w3-col m10 l10">
{% else %}
<div class="w3-col m12 l12">
{% endif %}
<div class="w3-right-align">
{% if msg.reply is not none %}
<div class="reply">
<span class="blue">Replying to </span>
<a href="#{{msg.reply}}" target="_self" class="reply_link no-base">
{% if msg.quoted_data is not none %}
"{{msg.quoted_data}}"
{% else %}
this message
{% endif %}
</a>
</div>
{% endif %}
{% if msg.meta == true or msg.media == false and msg.data is none %}
<div class="w3-panel w3-border-blue w3-pale-blue w3-rightbar w3-leftbar w3-threequarter w3-center">
{% if msg.safe %}
<p>{{ msg.data | safe or 'Not supported WhatsApp internal message' }}</p>
{% else %}
<p>{{ msg.data or 'Not supported WhatsApp internal message' }}</p>
{% endif %}
</div>
{% if msg.caption is not none %}
<div class="w3-container">
{{ msg.caption | urlize(none, true, '_blank') }}
</div>
{% endif %}
{% else %}
{% if msg.media == false %}
{{ msg.data | sanitize_except() | urlize(none, true, '_blank') }}
{% else %}
{% if "image/" in msg.mime %}
<a href="{{ msg.data }}">
<img src="{{ msg.thumb if msg.thumb is not none else msg.data }}" {{ 'class="sticker"' | safe if msg.sticker }} loading="lazy"/>
</a>
{% elif "audio/" in msg.mime %}
<audio controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</audio>
{% elif "video/" in msg.mime %}
<video class="lazy" autobuffer {% if msg.message_type|int == 13 or msg.message_type|int == 11 %}autoplay muted loop playsinline{%else%}controls{% endif %}>
<source type="{{ msg.mime }}" data-src="{{ msg.data }}" />
</video>
{% elif "/" in msg.mime %}
<div class="w3-panel w3-border-blue w3-pale-blue w3-rightbar w3-leftbar w3-threequarter w3-center">
<p>The file cannot be displayed here, however it should be located at <a href="./{{ msg.data }}">here</a></p>
</div>
{% else %}
{% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %}
{% if msg.caption is not none %}
<div class="w3-container">
{{ msg.caption | urlize(none, true, '_blank') }}
</div>
{% endif %}
{% endif %}
{% endif %}
</div>
</div>
{% if not no_avatar and my_avatar is not none %}
<div class="w3-col m2 l2 pad-left-10">
<a href="{{ my_avatar }}">
<img src="{{ my_avatar }}" onerror="this.style.display='none'" class="avatar" loading="lazy">
</a>
</div>
{% endif %}
</div>
{% else %}
<div class="w3-row">
<div class="w3-left pad-right-10 name">
{% if msg.sender is not none %}
{{ msg.sender }}
{% else %}
{{ name }}
{% endif %}
</div>
<div class="w3-right-align blue">{{ msg.time }}</div>
</div>
<div class="w3-row">
{% if not no_avatar %}
<div class="w3-col m2 l2">
{% if their_avatar is not none %}
<a href="{{ their_avatar }}"><img src="{{ their_avatar_thumb or '' }}" onerror="this.style.display='none'" class="avatar" loading="lazy"></a>
{% else %}
<img src="{{ their_avatar_thumb or '' }}" onerror="this.style.display='none'" class="avatar" loading="lazy">
{% endif %}
</div>
<div class="w3-col m10 l10">
{% else %}
<div class="w3-col m12 l12">
{% endif %}
<div class="w3-left-align">
{% if msg.reply is not none %}
<div class="reply">
<span class="blue">Replying to </span>
<a href="#{{msg.reply}}" target="_self" class="reply_link no-base">
{% if msg.quoted_data is not none %}
"{{msg.quoted_data}}"
{% else %}
this message
{% endif %}
</a>
</div>
{% endif %}
{% if msg.meta == true or msg.media == false and msg.data is none %}
<div class="w3-panel w3-border-blue w3-pale-blue w3-rightbar w3-leftbar w3-threequarter w3-center">
{% if msg.safe %}
<p>{{ msg.data | safe or 'Not supported WhatsApp internal message' }}</p>
{% else %}
<p>{{ msg.data or 'Not supported WhatsApp internal message' }}</p>
{% endif %}
</div>
{% if msg.caption is not none %}
<div class="w3-container">
{{ msg.caption | urlize(none, true, '_blank') }}
</div>
{% endif %}
{% else %}
{% if msg.media == false %}
{{ msg.data | sanitize_except() | urlize(none, true, '_blank') }}
{% else %}
{% if "image/" in msg.mime %}
<a href="{{ msg.data }}">
<img src="{{ msg.thumb if msg.thumb is not none else msg.data }}" {{ 'class="sticker"' | safe if msg.sticker }} loading="lazy"/>
</a>
{% elif "audio/" in msg.mime %}
<audio controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</audio>
{% elif "video/" in msg.mime %}
<video class="lazy" autobuffer {% if msg.message_type|int == 13 or msg.message_type|int == 11 %}autoplay muted loop playsinline{%else%}controls{% endif %}>
<source type="{{ msg.mime }}" data-src="{{ msg.data }}" />
</video>
{% elif "/" in msg.mime %}
<div class="w3-panel w3-border-blue w3-pale-blue w3-rightbar w3-leftbar w3-threequarter w3-center">
<p>The file cannot be displayed here, however it should be located at <a href="./{{ msg.data }}">here</a></p>
</div>
{% else %}
{% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %}
{% if msg.caption is not none %}
<div class="w3-container">
{{ msg.caption | urlize(none, true, '_blank') }}
</div>
{% endif %}
{% endif %}
{% endif %}
</div>
</div>
</div>
{% endif %}
</div>
{% endfor %}
</div>
</article>
<footer class="w3-center">
<h2>
{% if previous %}
<a href="./{{ previous }}" target="_self">Previous</a>
{% endif %}
</h2>
<h2>
{% if next %}
<a href="./{{ next }}" target="_self">Next</a>
{% else %}
End of History
{% endif %}
</h2>
<br>
Portions of this page are reproduced from <a href="https://web.dev/articles/lazy-loading-video">work</a> created and <a href="https://developers.google.com/readme/policies">shared by Google</a> and used according to terms described in the <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache 2.0 License</a>.
</footer>
<script>
document.addEventListener("DOMContentLoaded", function() {
var lazyVideos = [].slice.call(document.querySelectorAll("video.lazy"));
if ("IntersectionObserver" in window) {
var lazyVideoObserver = new IntersectionObserver(function(entries, observer) {
entries.forEach(function(video) {
if (video.isIntersecting) {
for (var source in video.target.children) {
var videoSource = video.target.children[source];
if (typeof videoSource.tagName === "string" && videoSource.tagName === "SOURCE") {
videoSource.src = videoSource.dataset.src;
}
}
video.target.load();
video.target.classList.remove("lazy");
lazyVideoObserver.unobserve(video.target);
}
});
});
lazyVideos.forEach(function(lazyVideo) {
lazyVideoObserver.observe(lazyVideo);
});
}
});
</script>
<script>
// Prevent the <base> tag from affecting links with the class "no-base"
document.querySelectorAll('.no-base').forEach(link => {
link.addEventListener('click', function(event) {
const href = this.getAttribute('href');
if (href.startsWith('#')) {
window.location.hash = href;
event.preventDefault();
}
});
});
</script>
</body>
</html>

Binary file not shown. (Before: 15 KiB)

Binary file not shown. (Before: 126 KiB, After: 116 KiB)

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "whatsapp-chat-exporter"
version = "0.12.0"
version = "0.13.0"
description = "A Whatsapp database parser that provides history of your Whatsapp conversations in HTML and JSON. Android, iOS, iPadOS, Crypt12, Crypt14, Crypt15 supported."
readme = "README.md"
authors = [
@@ -19,10 +19,11 @@ keywords = [
]
classifiers = [
"Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Programming Language :: Python :: 3.14",
"License :: OSI Approved :: MIT License",
"Operating System :: OS Independent",
"Development Status :: 4 - Beta",
@@ -32,10 +33,11 @@ classifiers = [
"Topic :: Utilities",
"Topic :: Database"
]
requires-python = ">=3.9"
requires-python = ">=3.10"
dependencies = [
"jinja2",
"bleach"
"bleach",
"tqdm"
]
[project.optional-dependencies]
@@ -43,10 +45,9 @@ android_backup = ["pycryptodome", "javaobj-py3"]
crypt12 = ["pycryptodome"]
crypt14 = ["pycryptodome"]
crypt15 = ["pycryptodome", "javaobj-py3"]
all = ["pycryptodome", "javaobj-py3", "vobject"]
everything = ["pycryptodome", "javaobj-py3", "vobject"]
all = ["pycryptodome", "javaobj-py3"]
everything = ["pycryptodome", "javaobj-py3"]
backup = ["pycryptodome", "javaobj-py3"]
vcards = ["vobject", "pycryptodome", "javaobj-py3"]
[project.scripts]
wtsexporter = "Whatsapp_Chat_Exporter.__main__:main"
@@ -59,3 +60,8 @@ include = ["Whatsapp_Chat_Exporter"]
[tool.setuptools.package-data]
Whatsapp_Chat_Exporter = ["*.html"]
[dependency-groups]
dev = [
"pytest>=8.3.5",
]

@@ -6,19 +6,20 @@ Contributed by @magpires https://github.com/KnugiHK/WhatsApp-Chat-Exporter/issue
import re
import argparse
def process_phone_number(raw_phone):
"""
Process the raw phone string from the VCARD and return two formatted numbers:
- The original formatted number, and
- A modified formatted number with the extra (ninth) digit removed, if applicable.
Desired output:
For a number with a 9-digit subscriber:
Original: "+55 {area} {first 5 of subscriber}-{last 4 of subscriber}"
Modified: "+55 {area} {subscriber[1:5]}-{subscriber[5:]}"
For example, for an input that should represent "027912345678", the outputs are:
"+55 27 91234-5678" and "+55 27 1234-5678"
This function handles numbers that may already include a "+55" prefix.
It expects that after cleaning, a valid number (without the country code) should have either 10 digits
(2 for area + 8 for subscriber) or 11 digits (2 for area + 9 for subscriber).
@@ -26,18 +27,18 @@ def process_phone_number(raw_phone):
"""
# Store the original input for processing
number_to_process = raw_phone.strip()
# Remove all non-digit characters
digits = re.sub(r'\D', '', number_to_process)
# If the number starts with '55', remove it for processing
if digits.startswith("55") and len(digits) > 11:
digits = digits[2:]
# Remove trunk zero if present
if digits.startswith("0"):
digits = digits[1:]
# After cleaning, we expect a valid number to have either 10 or 11 digits
# If there are extra digits, use the last 11 (for a 9-digit subscriber) or last 10 (for an 8-digit subscriber)
if len(digits) > 11:
@@ -46,7 +47,7 @@ def process_phone_number(raw_phone):
elif len(digits) > 10 and len(digits) < 11:
# In some cases with an 8-digit subscriber, take the last 10 digits
digits = digits[-10:]
# Check if we have a valid number after processing
if len(digits) not in (10, 11):
return None, None
@@ -70,6 +71,7 @@ def process_phone_number(raw_phone):
return original_formatted, modified_formatted
def process_vcard(input_vcard, output_vcard):
"""
Process a VCARD file to standardize telephone entries and add a second TEL line
@@ -77,13 +79,13 @@ def process_vcard(input_vcard, output_vcard):
"""
with open(input_vcard, 'r', encoding='utf-8') as file:
lines = file.readlines()
output_lines = []
# Regex to capture any telephone line.
# It matches lines starting with "TEL:" or "TEL;TYPE=..." or with prefixes like "item1.TEL:".
phone_pattern = re.compile(r'^(?P<prefix>.*TEL(?:;TYPE=[^:]+)?):(?P<number>.*)$')
for line in lines:
stripped_line = line.rstrip("\n")
match = phone_pattern.match(stripped_line)
@@ -99,10 +101,11 @@ def process_vcard(input_vcard, output_vcard):
output_lines.append(f"TEL;TYPE=CELL:{mod_formatted}\n")
else:
output_lines.append(line)
with open(output_vcard, 'w', encoding='utf-8') as file:
file.writelines(output_lines)
if __name__ == '__main__':
parser = argparse.ArgumentParser(
description="Process a VCARD file to standardize telephone entries and add a second TEL line with the modified number (removing the extra ninth digit) for contacts with 9-digit subscribers."
@@ -110,6 +113,6 @@ if __name__ == '__main__':
parser.add_argument('input_vcard', type=str, help='Input VCARD file')
parser.add_argument('output_vcard', type=str, help='Output VCARD file')
args = parser.parse_args()
process_vcard(args.input_vcard, args.output_vcard)
print(f"VCARD processed and saved to {args.output_vcard}")
print(f"VCARD processed and saved to {args.output_vcard}")
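Review note: the formatting rule described in the `process_phone_number` docstring can be sketched in isolation as below. This is a minimal sketch and not the script's actual body, parts of which are elided in this diff; the name `format_br_number` is hypothetical, and the sketch skips the script's handling of over-long inputs.

```python
import re


def format_br_number(raw):
    """Return (original, modified) formatted numbers, or (None, None) if invalid.

    Hypothetical helper illustrating the docstring's rule; not the script's code.
    """
    digits = re.sub(r"\D", "", raw)
    # Drop the country code only when the number is too long to be area + subscriber.
    if digits.startswith("55") and len(digits) > 11:
        digits = digits[2:]
    # Drop the trunk zero if present.
    if digits.startswith("0"):
        digits = digits[1:]
    if len(digits) not in (10, 11):
        return None, None
    area, subscriber = digits[:2], digits[2:]
    if len(subscriber) == 9:
        original = f"+55 {area} {subscriber[:5]}-{subscriber[5:]}"
        # Remove the extra (ninth) leading digit for the second TEL line.
        modified = f"+55 {area} {subscriber[1:5]}-{subscriber[5:]}"
        return original, modified
    # 8-digit subscriber: there is no extra digit to remove.
    return f"+55 {area} {subscriber[:4]}-{subscriber[4:]}", None


print(format_br_number("027912345678"))
```

The length check before stripping `55` matters because 55 is also a valid area code, so an 11-digit input beginning with `55` is left alone.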

@@ -27,23 +27,24 @@ def _extract_encrypted_key(keyfile):
return _generate_hmac_of_hmac(key_stream)
key = open("encrypted_backup.key", "rb").read()
database = open("wa.db.crypt15", "rb").read()
main_key, hex_key = _extract_encrypted_key(key)
for i in range(100):
iv = database[i:i+16]
for j in range(100):
cipher = AES.new(main_key, AES.MODE_GCM, iv)
db_ciphertext = database[j:]
db_compressed = cipher.decrypt(db_ciphertext)
try:
db = zlib.decompress(db_compressed)
except zlib.error:
...
else:
if db[0:6] == b"SQLite":
print(f"Found!\nIV: {i}\nOffset: {j}")
print(db_compressed[:10])
exit()
if __name__ == "__main__":
key = open("encrypted_backup.key", "rb").read()
database = open("wa.db.crypt15", "rb").read()
main_key, hex_key = _extract_encrypted_key(key)
for i in range(100):
iv = database[i:i+16]
for j in range(100):
cipher = AES.new(main_key, AES.MODE_GCM, iv)
db_ciphertext = database[j:]
db_compressed = cipher.decrypt(db_ciphertext)
try:
db = zlib.decompress(db_compressed)
except zlib.error:
...
else:
if db[0:6] == b"SQLite":
print(f"Found!\nIV: {i}\nOffset: {j}")
print(db_compressed[:10])
exit()
print("Not found! Try to increase maximum search.")
print("Not found! Try to increase maximum search.")
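Review note: the inner loop of the script is an offset search that accepts a candidate only if it decompresses and starts with the SQLite magic bytes. The sketch below shows just that search on synthetic data, using only `zlib`. It deliberately omits the AES-GCM decryption step, and `find_payload_offset` is a hypothetical name, not part of the script.

```python
import zlib


def find_payload_offset(blob, max_offset=100, magic=b"SQLite"):
    """Return the first offset where blob decompresses to data starting with magic."""
    for offset in range(max_offset):
        try:
            data = zlib.decompress(blob[offset:])
        except zlib.error:
            continue  # not a valid zlib stream at this offset
        if data.startswith(magic):
            return offset
    return None  # caller may retry with a larger max_offset


# Synthetic input (assumption): 7 junk bytes, then a compressed fake database header.
blob = b"\x00" * 7 + zlib.compress(b"SQLite format 3\x00" + b"\x00" * 32)
print(find_payload_offset(blob))
```

Returning `None` instead of calling `exit()` keeps the search reusable; the "Not found" message can then be printed by the caller.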

0
tests/__init__.py Normal file

27
tests/conftest.py Normal file

@@ -0,0 +1,27 @@
import pytest
import os


def pytest_collection_modifyitems(config, items):
    """
    Moves test_nuitka_binary.py to the end and fails if the file is missing.
    """
    target_file = "test_nuitka_binary.py"

    # Sanity Check: Ensure the file actually exists in the tests directory
    test_dir = os.path.join(config.rootdir, "tests")
    file_path = os.path.join(test_dir, target_file)
    if not os.path.exists(file_path):
        pytest.exit(f"\n[FATAL] Required test file '{target_file}' not found in {test_dir}. "
                    f"Order enforcement failed!", returncode=1)

    nuitka_tests = []
    remaining_tests = []
    for item in items:
        if target_file in item.nodeid:
            nuitka_tests.append(item)
        else:
            remaining_tests.append(item)
    items[:] = remaining_tests + nuitka_tests
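Review note: the hook above is a stable partition of the collected items. A minimal sketch of the same reordering on plain strings, assuming made-up node ids and a hypothetical helper name `move_to_end`, looks like this:

```python
def move_to_end(node_ids, target_file):
    """Stable partition: ids mentioning target_file go last, relative order kept."""
    matching = [n for n in node_ids if target_file in n]
    remaining = [n for n in node_ids if target_file not in n]
    return remaining + matching


# Hypothetical node ids, for illustration only.
ids = [
    "tests/test_nuitka_binary.py::test_build",
    "tests/test_exporter.py::test_sanity_check",
    "tests/test_nuitka_binary.py::test_run",
    "tests/test_other.py::test_something",
]
ordered = move_to_end(ids, "test_nuitka_binary.py")
for node_id in ordered:
    print(node_id)
```

In the real hook the result is assigned back with `items[:] = ...` because pytest expects the list to be modified in place.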

44
tests/data/contacts.vcf Normal file

@@ -0,0 +1,44 @@
BEGIN:VCARD
VERSION:3.0
FN:Sample Contact
TEL;TYPE=CELL:+85288888888
END:VCARD
BEGIN:VCARD
VERSION:2.1
N:Lopez;Yard Lawn Guy;Jose;;
FN:Yard Lawn Guy, Jose Lopez
TEL;HOME:5673334444
END:VCARD
BEGIN:VCARD
VERSION:2.1
N;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:;=4A=6F=68=6E=20=42=75=74=6C=65=72=20=F0=9F=8C=9F=
=F0=9F=92=AB=F0=9F=8C=9F;;;
FN;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=4A=6F=68=6E=20=42=75=74=6C=65=72=20=F0=9F=8C=9F=
=F0=9F=92=AB=F0=9F=8C=9F
TEL;PREF:5556667777
END:VCARD
BEGIN:VCARD
VERSION:2.1
TEL;WORK;PREF:1234567890
ORG:Airline Contact #'s
NOTE;ENCODING=QUOTED-PRINTABLE:=53=70=69=72=69=74=20=41=69=72=6C=69=
=6E=65=73=20=38=30=30=2D=37=37=32=2D=37=31=31=37=55=6E=69=74=65=64=
=20=41=69=72=6C=69=6E=65=73=20=38=30=30=2D=32=34=31=2D=36=35=32=32
END:VCARD
BEGIN:VCARD
VERSION:2.1
TEL;WORK;PREF:3451112222
X-SAMSUNGADR;ENCODING=QUOTED-PRINTABLE:;;=31=31=31=31=32=20=4E=6F=72=74=68=20=45=6C=64=72=
=69=64=67=65=20=50=61=72=6B=77=61=79;=44=61=6C=6C=61=73;=54=58;=32=32=32=32=32
ORG:James Peacock Elementary
END:VCARD
BEGIN:VCARD
VERSION:2.1
TEL;CELL:8889990001
ORG:AAA Car Service
END:VCARD

@@ -4,13 +4,14 @@ import tempfile
import os
from unittest.mock import patch
from brazilian_number_processing import process_phone_number, process_vcard
from scripts.brazilian_number_processing import process_phone_number, process_vcard
class TestVCardProcessor(unittest.TestCase):
def test_process_phone_number(self):
"""Test the process_phone_number function with various inputs."""
# Test cases for 9-digit subscriber numbers
test_cases_9_digit = [
# Standard 11-digit number (2 area + 9 subscriber)
@@ -30,7 +31,7 @@ class TestVCardProcessor(unittest.TestCase):
# With extra non-digit characters
("+55-27-9.1234_5678", "+55 27 91234-5678", "+55 27 1234-5678"),
]
# Test cases for 8-digit subscriber numbers
test_cases_8_digit = [
# Standard 10-digit number (2 area + 8 subscriber)
@@ -46,7 +47,7 @@ class TestVCardProcessor(unittest.TestCase):
# With country code and trunk zero
("+55 0 27 1234-5678", "+55 27 1234-5678", None),
]
# Edge cases
edge_cases = [
# Too few digits
@@ -60,19 +61,19 @@ class TestVCardProcessor(unittest.TestCase):
# Unusual formatting but valid number
("(+55) [27] 9.1234_5678", "+55 27 91234-5678", "+55 27 1234-5678"),
]
# Run tests for all cases
all_cases = test_cases_9_digit + test_cases_8_digit + edge_cases
for raw_phone, expected_orig, expected_mod in all_cases:
with self.subTest(raw_phone=raw_phone):
orig, mod = process_phone_number(raw_phone)
self.assertEqual(orig, expected_orig)
self.assertEqual(mod, expected_mod)
def test_process_vcard(self):
"""Test the process_vcard function with various VCARD formats."""
# Test case 1: Standard TEL entries
vcard1 = """BEGIN:VCARD
VERSION:3.0
@@ -202,26 +203,26 @@ END:VCARD
(vcard5, expected5),
(vcard6, expected6)
]
for i, (input_vcard, expected_output) in enumerate(test_cases):
with self.subTest(case=i+1):
# Create temporary files for input and output
with tempfile.NamedTemporaryFile(mode='w+', delete=False, encoding='utf-8') as input_file:
input_file.write(input_vcard)
input_path = input_file.name
output_path = input_path + '.out'
try:
# Process the VCARD
process_vcard(input_path, output_path)
# Read and verify the output
with open(output_path, 'r', encoding='utf-8') as output_file:
actual_output = output_file.read()
self.assertEqual(actual_output, expected_output)
finally:
# Clean up temporary files
if os.path.exists(input_path):
@@ -231,7 +232,7 @@ END:VCARD
def test_script_argument_handling(self):
"""Test the script's command-line argument handling."""
test_input = """BEGIN:VCARD
VERSION:3.0
N:Test;User;;;
@@ -239,16 +240,17 @@ FN:User Test
TEL:+5527912345678
END:VCARD
"""
# Create a temporary input file
with tempfile.NamedTemporaryFile(mode='w+', delete=False, encoding='utf-8') as input_file:
input_file.write(test_input)
input_path = input_file.name
output_path = input_path + '.out'
try:
test_args = ['python' if os.name == 'nt' else 'python3', 'brazilian_number_processing.py', input_path, output_path]
test_args = ['python' if os.name == 'nt' else 'python3',
'scripts/brazilian_number_processing.py', input_path, output_path]
# We're just testing that the argument parsing works
subprocess.call(
test_args,
@@ -257,7 +259,7 @@ END:VCARD
)
# Check if the output file was created
self.assertTrue(os.path.exists(output_path))
finally:
# Clean up temporary files
if os.path.exists(input_path):
@@ -265,5 +267,6 @@ END:VCARD
if os.path.exists(output_path):
os.unlink(output_path)
if __name__ == '__main__':
unittest.main()

50
tests/test_exporter.py Normal file

@@ -0,0 +1,50 @@
import subprocess
import pytest


@pytest.fixture
def command_runner():
    """
    A pytest fixture to simplify running commands. This is a helper
    function that you can use in multiple tests.
    """
    def _run_command(command_list, check=True):
        """
        Runs a command and returns the result.

        Args:
            command_list (list): A list of strings representing the command
                and its arguments (e.g., ["python", "my_script.py", "arg1"]).
            check (bool, optional): If True, raise an exception if the
                command returns a non-zero exit code. Defaults to True.

        Returns:
            subprocess.CompletedProcess: The result of the command.
        """
        return subprocess.run(
            command_list,
            capture_output=True,
            text=True,
            check=check,
        )
    return _run_command


def test_sanity_check(command_runner):
    """
    This is a basic sanity check to make sure all modules can be imported.
    This runs the exporter without any arguments. It should fail with a
    message about missing arguments.
    """
    result = command_runner(["wtsexporter"], False)
    expected_stderr = "You must define the device type"
    assert expected_stderr in result.stderr, f"STDERR was: {result.stderr}"
    assert result.returncode == 2


def test_android(command_runner):
    ...


def test_ios(command_runner):
    ...
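Review note: `test_sanity_check` passes `check=False` so that it can inspect a non-zero exit code instead of getting an exception. The sketch below illustrates that difference outside pytest. It assumes a stand-in child process started with `sys.executable` rather than `wtsexporter`, and the error text is made up.

```python
import subprocess
import sys


def run_command(command_list, check=True):
    """Same call shape as the fixture's inner helper."""
    return subprocess.run(command_list, capture_output=True, text=True, check=check)


# Stand-in for a CLI that rejects missing arguments with exit code 2 (assumption).
failing = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('missing argument'); sys.exit(2)",
]

# check=False: the failure is returned as data that a test can assert on.
result = run_command(failing, check=False)

# check=True (the default): the same failure raises CalledProcessError.
try:
    run_command(failing)
    raised = False
except subprocess.CalledProcessError as exc:
    raised = True
    code_from_exception = exc.returncode

print(result.returncode, result.stderr, raised)
```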

@@ -0,0 +1,344 @@
import os
import json
import pytest
from unittest.mock import patch, mock_open, call, MagicMock
from Whatsapp_Chat_Exporter.utility import incremental_merge
from Whatsapp_Chat_Exporter.data_model import ChatStore
# Test data setup
BASE_PATH = "AppDomainGroup-group.net.whatsapp.WhatsApp.shared"
chat_data_1 = {
"12345678@s.whatsapp.net": {
"name": "Friend",
"type": "ios",
"my_avatar": os.path.join(BASE_PATH, "Media", "Profile", "Photo.jpg"),
"their_avatar": os.path.join(BASE_PATH, "Media", "Profile", "12345678-1709851420.thumb"),
"their_avatar_thumb": None,
"status": None,
"messages": {
"24690": {
"from_me": True,
"timestamp": 1463926635.571629,
"time": "10:17",
"media": False,
"key_id": "34B5EF10FBCA37B7E",
"meta": False,
"data": "I'm here",
"safe": False,
"sticker": False
},
"24691": { # This message only exists in target
"from_me": False,
"timestamp": 1463926641.571629,
"time": "10:17",
"media": False,
"key_id": "34B5EF10FBCA37B8E",
"meta": False,
"data": "Great to see you",
"safe": False,
"sticker": False
}
}
}
}
chat_data_2 = {
"12345678@s.whatsapp.net": {
"name": "Friend",
"type": "ios",
"my_avatar": os.path.join(BASE_PATH, "Media", "Profile", "Photo.jpg"),
"their_avatar": os.path.join(BASE_PATH, "Media", "Profile", "12345678-1709851420.thumb"),
"their_avatar_thumb": None,
"status": None,
"messages": {
"24690": {
"from_me": True,
"timestamp": 1463926635.571629,
"time": "10:17",
"media": False,
"key_id": "34B5EF10FBCA37B7E",
"meta": False,
"data": "I'm here",
"safe": False,
"sticker": False
},
"24692": { # This message only exists in source
"from_me": False,
"timestamp": 1463926642.571629,
"time": "10:17",
"media": False,
"key_id": "34B5EF10FBCA37B9E",
"meta": False,
"data": "Hi there!",
"safe": False,
"sticker": False
},
}
}
}
# Expected merged data - should contain all messages with all fields initialized as they would be by Message class
chat_data_merged = {
"12345678@s.whatsapp.net": {
"name": "Friend",
"type": "ios",
"my_avatar": os.path.join(BASE_PATH, "Media", "Profile", "Photo.jpg"),
"their_avatar": os.path.join(BASE_PATH, "Media", "Profile", "12345678-1709851420.thumb"),
"their_avatar_thumb": None,
"status": None,
"media_base": "",
"messages": {
"24690": {
"from_me": True,
"timestamp": 1463926635.571629,
"time": "10:17",
"media": False,
"key_id": "34B5EF10FBCA37B7E",
"meta": False,
"data": "I'm here",
"sender": None,
"safe": False,
"mime": None,
"reply": None,
"quoted_data": None,
'reactions': {},
"caption": None,
"thumb": None,
"sticker": False,
"message_type": None,
"received_timestamp": None,
"read_timestamp": None
},
"24691": {
"from_me": False,
"timestamp": 1463926641.571629,
"time": "10:17",
"media": False,
"key_id": "34B5EF10FBCA37B8E",
"meta": False,
"data": "Great to see you",
"sender": None,
"safe": False,
"mime": None,
"reply": None,
"quoted_data": None,
'reactions': {},
"caption": None,
"thumb": None,
"sticker": False,
"message_type": None,
"received_timestamp": None,
"read_timestamp": None
},
"24692": {
"from_me": False,
"timestamp": 1463926642.571629,
"time": "10:17",
"media": False,
"key_id": "34B5EF10FBCA37B9E",
"meta": False,
"data": "Hi there!",
"sender": None,
"safe": False,
"mime": None,
"reply": None,
"quoted_data": None,
'reactions': {},
"caption": None,
"thumb": None,
"sticker": False,
"message_type": None,
"received_timestamp": None,
"read_timestamp": None
},
}
}
}
@pytest.fixture
def mock_filesystem():
with (
patch("os.path.exists") as mock_exists,
patch("os.makedirs") as mock_makedirs,
patch("os.path.getmtime") as mock_getmtime,
patch("os.listdir") as mock_listdir,
patch("os.walk") as mock_walk,
patch("shutil.copy2") as mock_copy2,
):
yield {
"exists": mock_exists,
"makedirs": mock_makedirs,
"getmtime": mock_getmtime,
"listdir": mock_listdir,
"walk": mock_walk,
"copy2": mock_copy2,
}
def test_incremental_merge_new_file(mock_filesystem):
"""Test merging when target file doesn't exist"""
source_dir = "/source"
target_dir = "/target"
media_dir = "media"
# Setup mock filesystem
mock_filesystem["exists"].side_effect = lambda x: x == "/source"
mock_filesystem["listdir"].return_value = ["chat.json"]
# Run the function
incremental_merge(source_dir, target_dir, media_dir, 2, True)
# Verify the operations
mock_filesystem["makedirs"].assert_called_once_with(target_dir, exist_ok=True)
mock_filesystem["copy2"].assert_called_once_with(
os.path.join(source_dir, "chat.json"),
os.path.join(target_dir, "chat.json")
)
def test_incremental_merge_existing_file_with_changes(mock_filesystem):
"""Test merging when target file exists and has changes"""
source_dir = "source"
target_dir = "target"
media_dir = "media"
# Setup mock filesystem
mock_filesystem["exists"].side_effect = lambda x: True
mock_filesystem["listdir"].return_value = ["chat.json"]
# Mock file operations with consistent path separators
source_file = os.path.join(source_dir, "chat.json")
target_file = os.path.join(target_dir, "chat.json")
mock_file_content = {
source_file: json.dumps(chat_data_2),
target_file: json.dumps(chat_data_1),
}
written_chunks = []
def mock_file_write(data):
written_chunks.append(data)
mock_write = MagicMock(side_effect=mock_file_write)
with patch("builtins.open", mock_open()) as mock_file:
def mock_file_read(filename, mode="r"):
if mode == 'w':
file_mock = mock_open().return_value
file_mock.write.side_effect = mock_write
return file_mock
else:
# Use normalized path for lookup
norm_filename = os.path.normpath(filename)
content = mock_file_content.get(norm_filename, '')
file_mock = mock_open(read_data=content).return_value
return file_mock
mock_file.side_effect = mock_file_read
# Run the function
incremental_merge(source_dir, target_dir, media_dir, 2, True)
# Verify file operations using os.path.join
mock_file.assert_any_call(source_file, "r")
mock_file.assert_any_call(target_file, "r")
mock_file.assert_any_call(target_file, "w")
# Rest of verification code...
assert mock_write.called, "Write method was never called"
written_data = json.loads(''.join(written_chunks))
assert written_data is not None, "No data was written"
assert written_data == chat_data_merged, "Merged data does not match expected result"
messages = written_data["12345678@s.whatsapp.net"]["messages"]
assert "24690" in messages, "Common message should be present"
assert "24691" in messages, "Target-only message should be preserved"
assert "24692" in messages, "Source-only message should be added"
assert len(messages) == 3, "Should have exactly 3 messages"
def test_incremental_merge_existing_file_no_changes(mock_filesystem):
"""Test merging when target file exists but has no changes"""
source_dir = "source"
target_dir = "target"
media_dir = "media"
# Setup mock filesystem
mock_filesystem["exists"].side_effect = lambda x: True
mock_filesystem["listdir"].return_value = ["chat.json"]
# Mock file operations with consistent path separators
source_file = os.path.join(source_dir, "chat.json")
target_file = os.path.join(target_dir, "chat.json")
mock_file_content = {
source_file: json.dumps(chat_data_1),
target_file: json.dumps(chat_data_1),
}
with patch("builtins.open", mock_open()) as mock_file:
def mock_file_read(filename, mode="r"):
if mode == 'w':
file_mock = mock_open().return_value
return file_mock
else:
# Use normalized path for lookup
norm_filename = os.path.normpath(filename)
content = mock_file_content.get(norm_filename, '')
file_mock = mock_open(read_data=content).return_value
return file_mock
mock_file.side_effect = mock_file_read
# Run the function
incremental_merge(source_dir, target_dir, media_dir, 2, True)
# Verify no write operations occurred on target file
write_calls = [
call for call in mock_file.mock_calls if call[0] == "().write"]
assert len(write_calls) == 0
def test_incremental_merge_media_copy(mock_filesystem):
"""Test media file copying during merge"""
source_dir = "source"
target_dir = "target"
media_dir = "media"
# Setup mock filesystem
mock_filesystem["exists"].side_effect = lambda x: True
mock_filesystem["listdir"].return_value = ["chat.json"]
mock_filesystem["walk"].return_value = [
(os.path.join(source_dir, "media"), ["subfolder"], ["file1.jpg"]),
(os.path.join(source_dir, "media", "subfolder"), [], ["file2.jpg"]),
]
mock_filesystem["getmtime"].side_effect = lambda x: 1000 if "source" in x else 500
# Mock file operations with consistent path separators
source_file = os.path.join(source_dir, "chat.json")
target_file = os.path.join(target_dir, "chat.json")
mock_file_content = {
source_file: json.dumps(chat_data_1),
target_file: json.dumps(chat_data_1),
}
with patch("builtins.open", mock_open()) as mock_file:
def mock_file_read(filename, mode="r"):
if mode == 'w':
file_mock = mock_open().return_value
return file_mock
else:
# Use normalized path for lookup
norm_filename = os.path.normpath(filename)
content = mock_file_content.get(norm_filename, '')
file_mock = mock_open(read_data=content).return_value
return file_mock
mock_file.side_effect = mock_file_read
# Run the function
incremental_merge(source_dir, target_dir, media_dir, 2, True)
# Verify media file operations
assert mock_filesystem["makedirs"].call_count >= 2 # At least target dir and media dir
assert mock_filesystem["copy2"].call_count == 2 # Two media files copied
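
A note on `test_incremental_merge_existing_file_no_changes`: the handles returned by `mock_file_read` come from separate `mock_open()` instances, so their `write` calls are probably not recorded in `mock_file.mock_calls`, and the `().write` filter would then be empty whether or not anything was written. One way to observe writes is to keep every write handle in a list and inspect it afterwards. The sketch below assumes a hypothetical `rewrite_if_changed` function standing in for the code under test; it is not part of the project, and `make_open_side_effect` and `count_writes` are illustrative names as well.

```python
import json
from unittest.mock import mock_open, patch


def make_open_side_effect(file_contents, write_handles):
    """Build a side_effect for a patched open() that remembers write handles."""
    def fake_open(filename, mode="r", *args, **kwargs):
        if "w" in mode:
            handle = mock_open().return_value
            write_handles.append((filename, handle))  # keep it for later inspection
            return handle
        content = file_contents.get(filename, "")
        return mock_open(read_data=content).return_value
    return fake_open


def rewrite_if_changed(path, new_data):
    """Hypothetical stand-in for the code under test: write only on a difference."""
    with open(path, "r") as f:
        current = json.load(f)
    if current != new_data:
        with open(path, "w") as f:
            json.dump(new_data, f)


def count_writes(existing, new_data):
    """Return how many write() calls reached any handle opened for writing."""
    write_handles = []
    contents = {"target/chat.json": json.dumps(existing)}
    with patch("builtins.open", mock_open()) as mock_file:
        mock_file.side_effect = make_open_side_effect(contents, write_handles)
        rewrite_if_changed("target/chat.json", new_data)
    return sum(handle.write.call_count for _, handle in write_handles)
```

With this shape, a "no changes" test can assert that `count_writes(data, data) == 0`, and a "changed" test can assert that the count is positive.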


@@ -0,0 +1,76 @@
import os
import sys
import pytest
import subprocess
@pytest.fixture
def command_runner():
"""
A pytest fixture to simplify running commands. This is a helper
function that you can use in multiple tests.
"""
def _run_command(command_list, check=True):
"""
Runs a command and returns the result.
Args:
command_list (list): A list of strings representing the command
and its arguments (e.g., ["python", "my_script.py", "arg1"]).
check (bool, optional): If True, raise an exception if the
command returns a non-zero exit code. Defaults to True.
Returns:
subprocess.CompletedProcess: The result of the command.
"""
return subprocess.run(
command_list,
capture_output=True,
text=True,
check=check,
)
return _run_command
def test_nuitka_binary():
    """
    Tests the creation and execution of a Nuitka-compiled binary.
    """
    if sys.version_info >= (3, 14):
        # Report the test as skipped instead of letting it pass silently.
        pytest.skip("Python 3.14 is not yet fully supported by Nuitka.")
    nuitka_command = [
        # Use the running interpreter so the build matches the test environment.
        sys.executable, "-m", "nuitka", "--onefile", "--assume-yes-for-downloads",
        "--include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html",
        "Whatsapp_Chat_Exporter",
        "--output-filename=wtsexporter.exe"  # use .exe on all platforms for compatibility
    ]
    compile_result = subprocess.run(
        nuitka_command,
        capture_output=True,
        text=True,
        check=True
    )
    print(f"Nuitka compilation output: {compile_result.stdout}")
    binary_path = "./wtsexporter.exe"
    assert os.path.exists(binary_path), f"Binary {binary_path} was not created."
    try:
        execute_result = subprocess.run(
            [binary_path, "--help"],
            capture_output=True,
            text=True,
            check=True,
        )
        print(f"Binary execution output: {execute_result.stdout}")
        assert "usage:" in execute_result.stdout.lower(), "Binary did not produce expected help output."
    except subprocess.CalledProcessError as e:
        print(f"Binary execution failed with error: {e.stderr}")
        raise
    finally:
        # Always clean up the compiled binary, even when the assertions fail.
        if os.path.exists(binary_path):
            os.remove(binary_path)
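
The run-and-check step of `test_nuitka_binary` can be tried without a Nuitka build. The sketch below substitutes a throwaway `python -c` command for the compiled binary (an assumption made purely for illustration) and applies the same `usage:` check to its output. `run_and_check_usage` is a hypothetical helper name.

```python
import subprocess
import sys


def run_and_check_usage(command):
    """Run a command and report whether its stdout contains a usage line."""
    result = subprocess.run(
        command,
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError on a non-zero exit code
    )
    return "usage:" in result.stdout.lower()


# Stand-in for the compiled binary: a child interpreter that prints a usage line.
fake_binary = [sys.executable, "-c", "print('Usage: demo [-h]')"]
print(run_and_check_usage(fake_binary))
```

Lower-casing the output before the check keeps the assertion independent of how the argument parser capitalises the word.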

352
tests/test_utility.py Normal file

@@ -0,0 +1,352 @@
import os
import pytest
import random
import string
from unittest.mock import patch, mock_open, MagicMock
from Whatsapp_Chat_Exporter.utility import *
def test_convert_time_unit():
assert convert_time_unit(0) == "less than a second"
assert convert_time_unit(1) == "a second"
assert convert_time_unit(10) == "10 seconds"
assert convert_time_unit(60) == "1 minute"
assert convert_time_unit(61) == "1 minute 1 second"
assert convert_time_unit(122) == "2 minutes 2 seconds"
assert convert_time_unit(3600) == "1 hour"
assert convert_time_unit(3661) == "1 hour 1 minute 1 second"
assert convert_time_unit(3720) == "1 hour 2 minutes"
assert convert_time_unit(3660) == "1 hour 1 minute"
assert convert_time_unit(7263) == "2 hours 1 minute 3 seconds"
assert convert_time_unit(86400) == "1 day"
assert convert_time_unit(86461) == "1 day 1 minute 1 second"
assert convert_time_unit(172805) == "2 days 5 seconds"
class TestBytesToReadable:
    # The assertions need to live in a test method; in a bare class body they
    # run at import time and pytest does not collect them as a test.
    def test_conversion(self):
        assert bytes_to_readable(0) == "0 B"
        assert bytes_to_readable(500) == "500 B"
        assert bytes_to_readable(1024) == "1.0 KB"
        assert bytes_to_readable(2048) == "2.0 KB"
        assert bytes_to_readable(1536) == "1.5 KB"
        assert bytes_to_readable(1024**2) == "1.0 MB"
        assert bytes_to_readable(5 * 1024**2) == "5.0 MB"
        assert bytes_to_readable(1024**3) == "1.0 GB"
        assert bytes_to_readable(1024**4) == "1.0 TB"
        assert bytes_to_readable(1024**5) == "1.0 PB"
        assert bytes_to_readable(1024**6) == "1.0 EB"
        assert bytes_to_readable(1024**7) == "1.0 ZB"
        assert bytes_to_readable(1024**8) == "1.0 YB"
class TestReadableToBytes:
    def test_conversion(self):
        assert readable_to_bytes("0B") == 0
        assert readable_to_bytes("100B") == 100
        assert readable_to_bytes("50 B") == 50
        assert readable_to_bytes("1KB") == 1024
        assert readable_to_bytes("2.5 KB") == 2560
        assert readable_to_bytes("2.0 KB") == 2048
        assert readable_to_bytes("1MB") == 1024**2
        assert readable_to_bytes("0.5 MB") == 524288
        assert readable_to_bytes("1. MB") == 1048576
        assert readable_to_bytes("1GB") == 1024**3
        assert readable_to_bytes("1.GB") == 1024**3
        assert readable_to_bytes("1TB") == 1024**4
        assert readable_to_bytes("1PB") == 1024**5
        assert readable_to_bytes("1EB") == 1024**6
        assert readable_to_bytes("1ZB") == 1024**7
        assert readable_to_bytes("1YB") == 1024**8

    def test_case_insensitivity(self):
        assert readable_to_bytes("1kb") == 1024
        assert readable_to_bytes("2mB") == 2 * 1024**2

    def test_whitespace(self):
        assert readable_to_bytes(" 10 KB ") == 10 * 1024
        assert readable_to_bytes(" 1 MB") == 1024**2

    def test_invalid_unit(self):
        # Each input gets its own pytest.raises block: inside a single block
        # only the first call runs, because the exception ends the block.
        for bad_input in ("100X", "A100", "100$$$$$"):
            with pytest.raises(ValueError, match="Invalid size format for size_str"):
                readable_to_bytes(bad_input)

    def test_invalid_number(self):
        with pytest.raises(ValueError, match="Invalid size format for size_str"):
            readable_to_bytes("ABC KB")

    def test_missing_unit(self):
        assert readable_to_bytes("100") == 100
class TestSanitizeExcept:
def test_no_tags(self):
html = "This is plain text."
assert sanitize_except(html) == Markup("This is plain text.")
def test_allowed_br_tag(self):
html = "Line 1<br>Line 2"
assert sanitize_except(html) == Markup("Line 1<br>Line 2")
html = "<br/>Line"
assert sanitize_except(html) == Markup("<br>Line")
html = "Line<br />"
assert sanitize_except(html) == Markup("Line<br>")
def test_mixed_tags(self):
html = "<b>Bold</b><br><i>Italic</i><img src='evil.gif'><script>alert('XSS')</script>"
assert sanitize_except(html) == Markup(
"&lt;b&gt;Bold&lt;/b&gt;<br>&lt;i&gt;Italic&lt;/i&gt;&lt;img src='evil.gif'&gt;&lt;script&gt;alert('XSS')&lt;/script&gt;")
def test_attribute_stripping(self):
html = "<br class='someclass'>"
assert sanitize_except(html) == Markup("<br>")
class TestDetermineDay:
    def test_same_day(self):
        timestamp1 = 1678838400  # March 15, 2023 00:00:00 GMT
        timestamp2 = 1678881600  # March 15, 2023 12:00:00 GMT
        assert determine_day(timestamp1, timestamp2) is None

    def test_different_day(self):
        timestamp1 = 1678886400  # March 15, 2023 13:20:00 GMT
        timestamp2 = 1678972800  # March 16, 2023 13:20:00 GMT
        assert determine_day(timestamp1, timestamp2) == datetime(2023, 3, 16).date()

    def test_crossing_month(self):
        timestamp1 = 1680220800  # March 31, 2023 00:00:00 GMT
        timestamp2 = 1680307200  # April 1, 2023 00:00:00 GMT
        assert determine_day(timestamp1, timestamp2) == datetime(2023, 4, 1).date()

    def test_crossing_year(self):
        timestamp1 = 1703980800  # December 31, 2023 00:00:00 GMT
        timestamp2 = 1704067200  # January 1, 2024 00:00:00 GMT
        assert determine_day(timestamp1, timestamp2) == datetime(2024, 1, 1).date()
class TestGetFileName:
def test_valid_contact_phone_number_no_chat_name(self):
chat = ChatStore(Device.ANDROID, name=None)
filename, name = get_file_name("1234567890@s.whatsapp.net", chat)
assert filename == "1234567890"
assert name == "1234567890"
def test_valid_contact_phone_number_with_chat_name(self):
chat = ChatStore(Device.IOS, name="My Chat Group")
filename, name = get_file_name("1234567890@s.whatsapp.net", chat)
assert filename == "1234567890-My-Chat-Group"
assert name == "My Chat Group"
def test_valid_contact_exported_chat(self):
chat = ChatStore(Device.ANDROID, name="Testing")
filename, name = get_file_name("ExportedChat", chat)
assert filename == "ExportedChat-Testing"
assert name == "Testing"
def test_valid_contact_special_ids(self):
chat = ChatStore(Device.ANDROID, name="Special Chat")
filename_000, name_000 = get_file_name("000000000000000", chat)
assert filename_000 == "000000000000000-Special-Chat"
assert name_000 == "Special Chat"
filename_001, name_001 = get_file_name("000000000000001", chat)
assert filename_001 == "000000000000001-Special-Chat"
assert name_001 == "Special Chat"
def test_unexpected_contact_format(self):
chat = ChatStore(Device.ANDROID, name="Some Chat")
with pytest.raises(ValueError, match="Unexpected contact format: invalid-contact"):
get_file_name("invalid-contact", chat)
def test_contact_with_hyphen_and_chat_name(self):
chat = ChatStore(Device.ANDROID, name="Another Chat")
filename, name = get_file_name("123-456-7890@g.us", chat)
assert filename == "Another-Chat"
assert name == "Another Chat"
def test_contact_with_hyphen_no_chat_name(self):
chat = ChatStore(Device.ANDROID, name=None)
filename, name = get_file_name("123-456-7890@g.us", chat)
assert filename == "123-456-7890"
assert name == "123-456-7890"
class TestGetCondForEmpty:
def test_enable_true(self):
condition = get_cond_for_empty(True, "c.jid", "c.broadcast")
assert condition == "AND (chat.hidden=0 OR c.jid='status@broadcast' OR c.broadcast>0)"
def test_enable_false(self):
condition = get_cond_for_empty(False, "other_jid", "other_broadcast")
assert condition == ""
class TestGetChatCondition:
...
class TestGetStatusLocation:
@patch('os.path.isdir')
@patch('os.path.isfile')
@patch('os.mkdir')
@patch('urllib.request.urlopen')
@patch('builtins.open', new_callable=mock_open)
def test_offline_static_set(self, mock_open_file, mock_urlopen, mock_mkdir, mock_isfile, mock_isdir):
mock_isdir.return_value = False
mock_isfile.return_value = False
mock_response = MagicMock()
mock_response.read.return_value = b'W3.CSS Content'
mock_urlopen.return_value.__enter__.return_value = mock_response
output_folder = "output_folder"
offline_static = "offline_static"
result = get_status_location(output_folder, offline_static)
assert result == os.path.join(offline_static, "w3.css")
mock_mkdir.assert_called_once_with(os.path.join(output_folder, offline_static))
mock_urlopen.assert_called_once_with("https://www.w3schools.com/w3css/4/w3.css")
mock_open_file.assert_called_once_with(os.path.join(output_folder, offline_static, "w3.css"), "wb")
mock_open_file().write.assert_called_once_with(b'W3.CSS Content')
def test_offline_static_not_set(self):
result = get_status_location("output_folder", "")
assert result == "https://www.w3schools.com/w3css/4/w3.css"
class TestSafeName:
def generate_random_string(length=50):
random.seed(10)
return ''.join(random.choice(string.ascii_letters + string.digits + "äöüß") for _ in range(length))
safe_name_test_cases = [
("This is a test string", "This-is-a-test-string"),
("This is a test string with special characters!@#$%^&*()",
"This-is-a-test-string-with-special-characters"),
("This is a test string with numbers 1234567890", "This-is-a-test-string-with-numbers-1234567890"),
("This is a test string with mixed case ThisIsATestString",
"This-is-a-test-string-with-mixed-case-ThisIsATestString"),
("This is a test string with extra spaces \u00A0 \u00A0 \u00A0 ThisIsATestString",
"This-is-a-test-string-with-extra-spaces-ThisIsATestString"),
("This is a test string with unicode characters äöüß",
"This-is-a-test-string-with-unicode-characters-äöüß"),
("這是一個包含中文的測試字符串", "這是一個包含中文的測試字符串"), # Chinese characters, should stay as is
(
f"This is a test string with long length {generate_random_string(1000)}",
f"This-is-a-test-string-with-long-length-{generate_random_string(1000)}",
),
("", ""), # Empty string
(" ", ""), # String with only space
("---", "---"), # String with only hyphens
("___", "___"), # String with only underscores
("a" * 100, "a" * 100), # Long string with single character
("a-b-c-d-e", "a-b-c-d-e"), # String with hyphen
("a_b_c_d_e", "a_b_c_d_e"), # String with underscore
("a b c d e", "a-b-c-d-e"), # String with spaces
("test.com/path/to/resource?param1=value1&param2=value2",
"test.compathtoresourceparam1value1param2value2"), # Test with URL
("filename.txt", "filename.txt"), # Test with filename
("Αυτή είναι μια δοκιμαστική συμβολοσειρά με ελληνικούς χαρακτήρες.",
"Αυτή-είναι-μια-δοκιμαστική-συμβολοσειρά-με-ελληνικούς-χαρακτήρες."), # Greek characters
("This is a test with комбинированные знаки ̆ example",
"This-is-a-test-with-комбинированные-знаки-example") # Mixed with unicode
]
@pytest.mark.parametrize("input_text, expected_output", safe_name_test_cases)
def test_safe_name(self, input_text, expected_output):
result = safe_name(input_text)
assert result == expected_output
class TestGetChatCondition:
def test_no_filter(self):
"""Test when filter is None"""
result = get_chat_condition(None, True, ["column1", "column2"])
assert result == ""
result = get_chat_condition(None, False, ["column1"])
assert result == ""
def test_include_single_chat_single_column(self):
"""Test including a single chat with single column"""
result = get_chat_condition(["1234567890"], True, ["phone"])
assert result == "AND ( phone LIKE '%1234567890%')"
def test_include_multiple_chats_single_column(self):
"""Test including multiple chats with single column"""
result = get_chat_condition(["1234567890", "0987654321"], True, ["phone"])
assert result == "AND ( phone LIKE '%1234567890%' OR phone LIKE '%0987654321%')"
def test_exclude_single_chat_single_column(self):
"""Test excluding a single chat with single column"""
result = get_chat_condition(["1234567890"], False, ["phone"])
assert result == "AND ( phone NOT LIKE '%1234567890%')"
def test_exclude_multiple_chats_single_column(self):
"""Test excluding multiple chats with single column"""
result = get_chat_condition(["1234567890", "0987654321"], False, ["phone"])
assert result == "AND ( phone NOT LIKE '%1234567890%' AND phone NOT LIKE '%0987654321%')"
def test_include_with_jid_android(self):
"""Test including chats with JID for Android platform"""
result = get_chat_condition(["1234567890"], True, ["phone", "name"], "jid", "android")
assert result == "AND ( phone LIKE '%1234567890%' OR (name LIKE '%1234567890%' AND jid.type == 1))"
def test_include_with_jid_ios(self):
"""Test including chats with JID for iOS platform"""
result = get_chat_condition(["1234567890"], True, ["phone", "name"], "jid", "ios")
assert result == "AND ( phone LIKE '%1234567890%' OR (name LIKE '%1234567890%' AND jid IS NOT NULL))"
def test_exclude_with_jid_android(self):
"""Test excluding chats with JID for Android platform"""
result = get_chat_condition(["1234567890"], False, ["phone", "name"], "jid", "android")
assert result == "AND ( phone NOT LIKE '%1234567890%' AND (name NOT LIKE '%1234567890%' AND jid.type == 1))"
def test_exclude_with_jid_ios(self):
"""Test excluding chats with JID for iOS platform"""
result = get_chat_condition(["1234567890"], False, ["phone", "name"], "jid", "ios")
assert result == "AND ( phone NOT LIKE '%1234567890%' AND (name NOT LIKE '%1234567890%' AND jid IS NOT NULL))"
def test_multiple_chats_with_jid_android(self):
"""Test multiple chats with JID for Android platform"""
result = get_chat_condition(["1234567890", "0987654321"], True, ["phone", "name"], "jid", "android")
expected = "AND ( phone LIKE '%1234567890%' OR (name LIKE '%1234567890%' AND jid.type == 1) OR phone LIKE '%0987654321%' OR (name LIKE '%0987654321%' AND jid.type == 1))"
assert result == expected
def test_multiple_chats_exclude_with_jid_android(self):
"""Test excluding multiple chats with JID for Android platform"""
result = get_chat_condition(["1234567890", "0987654321"], False, ["phone", "name"], "jid", "android")
expected = "AND ( phone NOT LIKE '%1234567890%' AND (name NOT LIKE '%1234567890%' AND jid.type == 1) AND phone NOT LIKE '%0987654321%' AND (name NOT LIKE '%0987654321%' AND jid.type == 1))"
assert result == expected
def test_invalid_column_count_with_jid(self):
"""Test error when column count is less than 2 but jid is provided"""
with pytest.raises(ValueError, match="There must be at least two elements in argument columns if jid is not None"):
get_chat_condition(["1234567890"], True, ["phone"], "jid", "android")
def test_unsupported_platform(self):
"""Test error when unsupported platform is provided"""
with pytest.raises(ValueError, match="Only android and ios are supported for argument platform if jid is not None"):
get_chat_condition(["1234567890"], True, ["phone", "name"], "jid", "windows")
def test_empty_filter_list(self):
"""Test with empty filter list"""
result = get_chat_condition([], True, ["phone"])
assert result == ""
result = get_chat_condition([], False, ["phone"])
assert result == ""
def test_filter_with_empty_strings(self):
"""Test with filter containing empty strings"""
result = get_chat_condition(["", "1234567890"], True, ["phone"])
assert result == "AND ( phone LIKE '%%' OR phone LIKE '%1234567890%')"
result = get_chat_condition([""], True, ["phone"])
assert result == "AND ( phone LIKE '%%')"
def test_special_characters_in_filter(self):
"""Test with special characters in filter values"""
result = get_chat_condition(["test@example.com"], True, ["email"])
assert result == "AND ( email LIKE '%test@example.com%')"
result = get_chat_condition(["user-name"], True, ["username"])
assert result == "AND ( username LIKE '%user-name%')"
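
The epoch values used in `TestDetermineDay` can be checked against their comments with the standard library alone. A small sketch follows; `describe_utc` is a hypothetical helper and not part of `utility.py`.

```python
from datetime import datetime, timezone


def describe_utc(timestamp):
    """Format a Unix timestamp as a UTC date and time string."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


# Print a few of the values used by the tests next to their UTC rendering.
for ts in (1678838400, 1678881600, 1678886400, 1704067200):
    print(ts, describe_utc(ts))
```

Passing `tz=timezone.utc` explicitly keeps the result independent of the time zone of the machine running the check.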

View File

@@ -0,0 +1,48 @@
# from contacts_names_from_vcards import readVCardsFile
import os
from Whatsapp_Chat_Exporter.vcards_contacts import normalize_number, read_vcards_file
def test_readVCardsFile():
data_dir = os.path.join(os.path.dirname(__file__), "data")
data = read_vcards_file(os.path.join(data_dir, "contacts.vcf"), "852")
if data:
print("Found Names")
print("-----------------------")
for count, contact_tuple in enumerate(data, start=1):
# The name is the second element of the tuple (at index 1)
name = contact_tuple[1]
# Print the count and the name
print(f"{count}. {name}")
print(data)
assert len(data) == 6
# Test simple contact name
assert data[0][1] == "Sample Contact"
# Test complex name
assert data[1][1] == "Yard Lawn Guy, Jose Lopez"
# Test name with emoji
assert data[2][1] == "John Butler 🌟💫🌟"
# Test note with multi-line encoding
assert data[3][1] == "Airline Contact #'s"
# Test address with multi-line encoding
assert data[4][1] == "James Peacock Elementary"
# Test business entry using ORG but not F/FN
assert data[5][1] == "AAA Car Service"
def test_create_number_to_name_dicts():
pass
def test_fuzzy_match_numbers():
pass
def test_normalize_number():
assert normalize_number('0531234567', '1') == '1531234567'
assert normalize_number('001531234567', '2') == '1531234567'
assert normalize_number('+1531234567', '34') == '1531234567'
assert normalize_number('053(123)4567', '34') == '34531234567'
assert normalize_number('0531-234-567', '58') == '58531234567'
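
Read together, the five assertions in `test_normalize_number` suggest three rules: a leading `+` or `00` marks a number that already carries a country code, a single leading `0` is replaced by the given country code, and punctuation is dropped. The sketch below is inferred only from those assertions. `normalize_number_sketch` is a hypothetical name, and the real `normalize_number` in `vcards_contacts` may be implemented differently.

```python
import re


def normalize_number_sketch(number, country_code):
    """Hypothetical normalizer reconstructed from the test assertions only."""
    has_plus = number.strip().startswith("+")
    digits = re.sub(r"\D", "", number)  # drop '+', brackets, hyphens, spaces
    if has_plus:
        return digits  # international format, country code already present
    if digits.startswith("00"):
        return digits[2:]  # international call prefix, strip it
    if digits.startswith("0"):
        return country_code + digits[1:]  # national format, swap the trunk 0
    # Assumption: inputs without any prefix are returned unchanged;
    # the tests do not cover this case.
    return digits


print(normalize_number_sketch("053(123)4567", "34"))
```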