Why does the app need Symlinks?

May 15, 2016 at 9:30 AM
I just started writing my own (very basic) PowerShell script to do backups with 7-Zip and then found your project. No need to reinvent the wheel, right?

Even though I am a PowerShell novice, I am quite curious as to why symlinks are required, and about the flow-on effect of needing to set friendly container names.

It's just an interesting design decision I would like to enquire about, because it seems unnecessarily complicated to my untrained eye.

Otherwise, keep up the good work, Andrea!
Coordinator
May 15, 2016 at 11:42 AM
Edited May 15, 2016 at 11:57 AM
Hi SMG1,
the script's need for symlinks comes from older versions of 7-Zip: I was not able to make 7-Zip archive from different sources because of a "Duplicate file name" error. You can find a sample here: http://stackoverflow.com/questions/12675246/zipping-files-with-the-same-name-in-different-folders-using-7z-listfile-feature

Having 7-Zip parse a list of files to archive, where paths were always relative to a single root, seemed to solve the issue. So I decided to create a virtual directory tree (with a single root holding only junctions to all the different sources), making everything relative.

Friendly names (or aliases) were an effort to give a meaningful label within the archive, of course, but also to shorten the full path of files as much as possible: a directory named "Blablablablablablablablablablabla" could be shortened in the archive without touching its original name in the filesystem.
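
To make the alias idea concrete, here is a minimal Python sketch of how a real path could be rewritten relative to a single virtual root. It is not the script's actual code: the function name `to_virtual_path`, the alias names, and the sample directories are all hypothetical, and the real script does this with junctions on disk rather than by rewriting strings.

```python
import ntpath  # Windows path rules, usable on any platform


def to_virtual_path(real_path, aliases):
    """Rewrite a real path as '<alias>\\<path below the source directory>'.

    `aliases` maps a friendly name to the source directory it stands for
    (hypothetical structure, chosen for illustration only).
    """
    real_clean = ntpath.normpath(real_path)
    real_norm = ntpath.normcase(real_clean)
    for alias, source in aliases.items():
        source_norm = ntpath.normcase(ntpath.normpath(source))
        # Match only whole directory components, not mere string prefixes.
        if real_norm.startswith(source_norm + "\\"):
            tail = real_clean[len(source_norm) + 1:]
            return ntpath.join(alias, tail)
    raise ValueError(f"{real_path} is not under any known source")


aliases = {
    "docs": r"C:\Users\me\Blablablablablablablablablablabla",
    "src2": r"C:\Test\src2",
}
short_name = to_virtual_path(
    r"C:\Users\me\Blablablablablablablablablablabla\notes\todo.txt", aliases
)
print(short_name)
```

Because every source gets its own alias, two files with the same name in different sources end up under different top-level folders of the virtual root.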

From that initial effort the script gained new directives (mostly based on regular expressions) which help the user to perform additional actions during file discovery (e.g. delete unneeded files) or to create complex filtering rules to archive only what is really needed. The script can also optionally preserve the directory structure by creating dummy files in empty directories which would otherwise never be archived by 7-Zip.
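
The dummy-file trick can be sketched in a few lines of Python. This is only an illustration of the technique, not the script's implementation: the function name `add_placeholders` and the placeholder name `.keep` are assumptions made for the example.

```python
import os
import tempfile


def add_placeholders(root, name=".keep"):
    """Create an empty placeholder file in every directory that has no entries.

    Returns the list of placeholder paths that were created.
    """
    created = []
    for dirpath, dirnames, filenames in os.walk(root):
        if not dirnames and not filenames:
            marker = os.path.join(dirpath, name)
            open(marker, "w").close()
            created.append(marker)
    return created


# Small demo on a throwaway tree: one empty directory, one with a file.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "empty"))
    os.makedirs(os.path.join(root, "full"))
    open(os.path.join(root, "full", "data.txt"), "w").close()
    created = [os.path.relpath(p, root) for p in add_placeholders(root)]
print(created)
```

Only the directory with no files and no subdirectories receives a placeholder, so an archiver that skips empty directories still records it.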

Hope this helps to clarify your questions.

All the best.

Andrea
May 15, 2016 at 12:51 PM
Edited May 15, 2016 at 1:07 PM
Thanks for the insight. It sounds like a novel way to work around the problem without extra disk space requirements.

I just tested it out with the previous stable version, 15.14 x64 (the latest stable is 16.00, just released), and I could no longer reproduce this bug unless my listfile.txt contained lines pointing to the exact same filename.

eg1:
C:\Test\src1\file1.txt
C:\Test\src2\file1.txt
In which case it would be attempting to put all the files in those directories in the root of the same archive (instead of keeping their folder structure)

I then removed the duplicates and ran it again:
eg2:
C:\Test\src1\file1.txt
C:\Test\src2\file2.txt
And confirmed that this method just dumps everything into the same folder.

If I kept my original folder structure in place (with duplicates) and then changed the listfile.txt to point to directories instead
eg3:
C:\Test\src1
C:\Test\src2
Or used relative paths
eg4:
\src1\file1.txt
\src2\file1.txt
Then it archived both directories fine with all files in them under their respective folders.
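
The collision in eg1 can be predicted before running 7-Zip. The following Python sketch assumes, as observed in eg1 and eg2 above, that entries given as full file paths are stored under their base name only; the function name `find_name_collisions` is hypothetical.

```python
import ntpath  # Windows path rules, usable on any platform
from collections import defaultdict


def find_name_collisions(listfile_lines):
    """Group listfile entries by base name and return the groups with more than one path.

    Assumption: each entry would be stored in the archive root under its
    base name, compared case-insensitively as on Windows.
    """
    by_name = defaultdict(list)
    for line in listfile_lines:
        path = line.strip()
        if path:
            by_name[ntpath.basename(path).lower()].append(path)
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}


eg1 = [r"C:\Test\src1\file1.txt", r"C:\Test\src2\file1.txt"]
eg2 = [r"C:\Test\src1\file1.txt", r"C:\Test\src2\file2.txt"]
print(find_name_collisions(eg1))
print(find_name_collisions(eg2))
```

For eg1 the two entries share a base name and are reported together; for eg2 nothing is reported.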

It would make sense that it can't have the same file names in the same folder, but there must have been a bug prior to this version that would have made my second example impossible.

Also 7-zip was now able to back up empty directories fine too, no dummy files needed.

It would be ideal if your little utility could do away with symlinks if they aren't needed anymore, since that would remove the administrator privileges requirement too. But it looks like a big job to rework it.

I had a look at your code. You've done an amazing job, but unfortunately it is a bit beyond my skill level to even attempt changes while keeping the same code quality. I would totally butcher it.

Thanks for all your work.

edit: I just got eg1 to work with the -spf2 flag as well.
eg5:
7z a -spf2 test.7z @listfile.txt
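
For anyone scripting this, a minimal Python sketch of writing the listfile and assembling the eg5 command line might look like the following. The helper name `build_7z_command` is hypothetical, the sketch assumes `7z` is on the PATH, and it only builds the argument list without running 7-Zip.

```python
import os
import tempfile


def build_7z_command(archive, sources, listfile_path, exe="7z"):
    """Write one source path per line to the listfile and return the argument list.

    The command mirrors eg5: 7z a -spf2 <archive> @<listfile>.
    """
    with open(listfile_path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(sources) + "\n")
    return [exe, "a", "-spf2", archive, "@" + listfile_path]


with tempfile.TemporaryDirectory() as tmp:
    listfile = os.path.join(tmp, "listfile.txt")
    command = build_7z_command(
        "test.7z",
        [r"C:\Test\src1\file1.txt", r"C:\Test\src2\file1.txt"],
        listfile,
    )
    with open(listfile, encoding="utf-8") as handle:
        written = handle.read().splitlines()
print(command)
```

The returned list could be passed to `subprocess.run(command, check=True)` on a machine where 7-Zip is installed.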
Coordinator
May 15, 2016 at 1:45 PM
Yes, thank you,
the issue has probably gone away in newer 7-Zip versions. Nevertheless, keep in mind that:
  1. The script supports several selection directives (inclusions and/or exclusions) which go far beyond the built-in selection wildcards used by 7-Zip
  2. When you use the -r switch (recursion), 7-Zip digs into ALL the subdirectories of a given source (regardless of their depth level and/or name). With the script you can skip levels or directory names and/or honor maximum depth levels
  3. 7-Zip does not let you select files to archive by the "Archive attribute" criterion out of the box: therefore you cannot implement backup strategies such as differential or incremental
  4. Related to the above, 7-Zip will not clear the archive bit on successfully archived files
  5. 7-Zip can't remove unneeded files during the scan process
  6. 7-Zip can't send a detailed email report of what it has done (it can of course send you the entire archive)
... and a lot more that I am currently working on (integration with event logs, automatic shutdown of the computer after a successful operation, parallel and async selection ...)
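
As an illustration of points 1 and 2, here is a minimal Python sketch of a selection pass that honors a maximum depth and skips directories by regular expression. It is not the script's code: `select_files` and its parameters `max_depth`, `skip_dirs` and `include` are names invented for this example.

```python
import os
import re
import tempfile


def select_files(root, max_depth=None, skip_dirs=None, include=None):
    """Return files under root, pruning by depth and by directory-name regex.

    Depth 0 is root itself; max_depth=1 also visits its direct subdirectories.
    """
    skip = re.compile(skip_dirs) if skip_dirs else None
    keep = re.compile(include) if include else None
    root = os.path.abspath(root)
    selected = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        # Prune in place so os.walk never descends into skipped or too-deep directories.
        dirnames[:] = [
            d for d in dirnames
            if not (skip and skip.search(d))
            and (max_depth is None or depth < max_depth)
        ]
        selected.extend(
            os.path.join(dirpath, f) for f in filenames
            if keep is None or keep.search(f)
        )
    return sorted(selected)


# Demo tree: a.txt, sub/b.txt, sub/deep/c.txt, tmp/d.txt
with tempfile.TemporaryDirectory() as root:
    for rel in ("a.txt", "sub/b.txt", "sub/deep/c.txt", "tmp/d.txt"):
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    picked = [os.path.relpath(p, root)
              for p in select_files(root, max_depth=1, skip_dirs=r"^tmp$")]
print(picked)
```

In the demo, the file two levels down and the file under the skipped `tmp` directory are left out, while a plain recursive scan would have included both.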

Long story short: I love 7-Zip, but it's specifically designed to archive and compress. All the logic for extensive selection and for implementing archiving strategies must be wrapped around it from the outside.

Your suggestion to get rid of symlinks, however, would break compatibility with "old" versions of 7-Zip which are still widely used on "stable" servers.
Anyway, you do not need Admin privileges to run the script: what is really needed is to run it under credentials which have write access to an NTFS drive (and of course to any source directory if you want to clear archive bits or delete files). If it's not the system drive you don't need to be Administrator. Use the --workdrive command line switch to set e.g. the D drive.
May 15, 2016 at 1:50 PM
OK Thank you very much.

I agree that 7z is very powerful for compression, but it is not a backup solution by itself.

Please tell me, does your script have any protections to catch file paths which may be longer than 260 characters (and thus fail), particularly in the log and email log?
Coordinator
May 15, 2016 at 1:53 PM
Yes it does.

In the log you will find all alerts generated by PathTooLong exceptions.
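
For readers who want to check their sources up front, a simple pre-check can be sketched in Python. This is not how the script detects the problem (it relies on the exceptions mentioned above); the function name `find_long_paths` is hypothetical, and the default limit assumes the classic Windows MAX_PATH value of 260, which counts the terminating NUL character.

```python
MAX_PATH = 260  # classic Win32 limit; assumed here to include the terminating NUL


def find_long_paths(paths, limit=MAX_PATH):
    """Return the paths whose length reaches or exceeds the limit."""
    return [p for p in paths if len(p) >= limit]


short_path = r"C:\Test\src1\file1.txt"
long_path = "C:\\" + "x" * 300 + ".txt"
flagged = find_long_paths([short_path, long_path])
print(len(flagged))
```

The limit is a parameter so that the check can be adjusted for systems where long path support is enabled.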