42 commits

Author SHA1 Message Date
Cristian
572b46cecf lint: Remove unused imports 2020-10-23 06:45:56 -05:00
Cristian
ae1484b8bf feat: Remove index.json and index.html generation from the regular process 2020-10-23 06:45:56 -05:00
Angel Rey
ad04fb5300 Replaced os.path in init index 2020-10-02 15:46:39 -05:00
Cristian
b18bbf8874 test: Fix tests post-rebase 2020-09-17 09:09:52 -05:00
apkallum
b99784b919 pathlib with / syntax for config, index 2020-09-17 09:09:52 -05:00
apkallum
594d9e49ce first attempt to migrate to Pathlib 2020-09-17 09:09:52 -05:00
Cristian Vargas
5e9b3099c6 Update fix_duplicate_links_in_index docstring (Co-authored-by: Nick Sweeting <git@sweeting.me>) 2020-09-15 08:05:46 -05:00
Cristian
f55153eab3 feat: Update update command to work with querysets 2020-09-15 08:05:46 -05:00
Cristian
fe9604a772 feat: Add tests for remove command 2020-09-15 08:05:46 -05:00
Cristian
a8ed72501d feat: Refactor remove command to use querysets 2020-09-15 08:05:46 -05:00
Cristian
be520d137a feat: Refactor add method to use querysets 2020-09-15 08:05:46 -05:00
Cristian
be0dff8126 feat: Add tests to refactored init command 2020-09-15 08:05:46 -05:00
Cristian
404f333e17 feat: Refactor get_invalid_folders to work with a queryset instead of a list of links 2020-09-15 08:05:46 -05:00
Cristian
6b4b7127b4 feat: Remove unused imports 2020-09-15 08:05:46 -05:00
Cristian
b8585dd92e feat: load_main_index returns a queryset now 2020-09-15 08:05:46 -05:00
Cristian
c16fdf1b47 feat: Update data folder check 2020-09-15 08:05:46 -05:00
Cristian
874403e667 feat: Remove patch_main_index 2020-09-15 08:05:46 -05:00
Cristian
31343c1367 feat: Update extractors and add command to use sql index as source of truth 2020-09-15 08:05:46 -05:00
Cristian
02f36b2096 feat: Replace index.json with index.sql as the main index in init 2020-09-15 08:05:46 -05:00
Nick Sweeting
e87f1d57a3 fix linters 2020-08-18 09:22:12 -04:00
Nick Sweeting
f18d92570e wip attempt to fix timestamp unique constraint errors 2020-08-18 08:30:09 -04:00
Nick Sweeting
15efb2d5ed new generic_html parser for extracting hrefs 2020-08-18 08:29:05 -04:00
Nick Sweeting
5f84a7bc6e better handle the case where json index lags behind sql index 2020-08-18 08:13:13 -04:00
Nick Sweeting
77d2f08a5c show more info in merge conflict error message 2020-08-18 08:12:35 -04:00
Nick Sweeting
f371032b71 show warning when killing archivebox during index writing 2020-08-18 04:38:29 -04:00
Nick Sweeting
225b63b732 skip invalid urls at all stages 2020-08-17 03:12:17 -04:00
Cristian
c073ea141d feat: Initial oneshot command proposal 2020-07-29 11:19:06 -05:00
Cristian
6006b4f93b refactor: Organize code to remove flake8 issues 2020-07-24 12:25:25 -05:00
Cristian
100fa5d1f5 fix: Guess timestamps and add placeholders to support older indices 2020-07-24 09:24:52 -05:00
Nick Sweeting
02a2fefbba Merge pull request #385 from apkallum/origin/output-permissions 2020-07-23 11:52:31 -04:00
apkallum
0ed2a23670 ensure correct permissions for output folder 2020-07-23 10:28:10 -04:00
Cristian
71f5f03a20 fix: Add notice for issues with index detail 2020-07-22 17:08:32 -05:00
Cristian
a5550b2105 fix: Rename logging folder to avoid naming conflicts (and circular import issues) 2020-07-22 11:02:13 -05:00
Cristian
f4d1b5121e refactor: Move logging.py to main module to avoid circular import issues 2020-07-17 18:00:04 -05:00
Cristian
5e2bf73f04 fix: Bugs related to add() refactor 2020-07-13 14:48:25 -05:00
Nick Sweeting
d3bfa98a91 fix depth flag and tweak logging 2020-07-13 11:26:34 -04:00
Nick Sweeting
dda3542d60 bump sql updated time after every link details save 2020-06-30 13:45:47 -04:00
Nick Sweeting
cb67b09f9d Merge branch 'master' into django 2020-06-25 21:30:29 -04:00
Nick Sweeting
ecfca13b6d fix present folders docstring 2019-05-02 15:20:21 -04:00
Nick Sweeting
1ac99621ab show progress during validate_links 2019-05-01 02:28:26 -04:00
Nick Sweeting
95007d9137 split up utils into separate files 2019-04-30 23:13:04 -04:00
Nick Sweeting
1b8abc0961 move everything out of legacy folder 2019-04-27 17:26:24 -04:00
Renamed from archivebox/legacy/index.py