RE: Failed to post to CodePlex.com project 7zbackup

Feb 2, 2010 at 11:09 PM

Hello Anlan,

Thanks a lot for your quick response. I think your script is a very good basis, and that is why I see good use for it for multiple purposes in our environment.

Please see my comments below.

From: Anlan [mailto:notifications@codeplex.com]
Sent: Monday, 1 February 2010 13:58
To: codeplex@scheppink.com
Subject: Re: Can we move to archive through this script? [7zbackup:82718]

From: Anlan

Hi and thank you for your appreciated post.

As you have already spotted, the script came to my mind for backup purposes. Nevertheless the "archive" option might be ... an option as well.

So ... let's follow your points, and feel free to correct me if I misinterpret your needs:

* Source Path : actually the selection criteria are very flexible in defining where to search for files to be backed up. If you look at the "Prepare your first selection criteria" paragraph on the Documentation page you will find several commented directives which can help define even complex logic in path/file selection (file ages, regular expressions over names and paths, extensions, whether to follow junctions or not, etc.). In addition to that there is always an implied selection condition which relies on the presence (or not) of the "Archive" attribute on files, driven by the type of backup you're performing. *There is a limitation though*: since the script "builds" a virtual root of directories by means of junctions, and *NTFS junctions cannot point to remote UNC shares*, the script is naturally limited to the boundaries of the server (or PC) where it is running.

I see some workarounds in having each server do its own backup/archiving, but ideally it would be centrally managed (by IT) on the server where all the archives are created/kept. Would it be possible to build the structure on UNC path names? We just found that MKLINK (available in Vista/Win7 and Win2k8) supports different types of soft and hard links/junctions, also to UNC names.

In the long run UNC should always work correctly with regard to sources and targets, but I also know that not all command-line tools support this.
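
To illustrate what we found (the server, share and folder names below are invented), a directory symbolic link pointing to a UNC path can be created from an elevated PowerShell prompt by calling the cmd.exe built-in mklink, which is exactly what junction.exe cannot do with its local-only targets:

# Vista/Win7/Win2k8 and later; requires elevation. Names are only examples.
cmd /c mklink /D C:\MasterRoot\FileServer1 \\fileserver1\archive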

* Email Alerts : at present the script is already able to send a detailed report of the operation to one or more email addresses. You might find useful information in the hardcoded-vars file.

I will have a good look at that, but what I meant was to be able to send e-mail (to one or more addressees) per job/line in the CSV. We have situations where we want some users to know about the archiving that was done, without IT involvement. This could perhaps be a default address with the option to override it or add extra addresses per job/line? I am not sure if this is already possible.
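
Just to sketch the idea (the column names are invented, not something the script has today), each job line could carry its own addressees next to the selection and type, with an empty field falling back to a default address:

# JobName;Selection;Type;MailTo
ErpExports;selection-erp.txt;MOVE;erpadmin@example.com,itops@example.com
OsTempClean;selection-os.txt;FULL;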

* Verbosity : The report produced by the script is, necessarily, detailed, as we most often use it to control remote backups performed nightly. Therefore we need a comprehensive report via email in one shot, rather than going to the remote machine and looking for exceptions. Nevertheless your suggestion brings to my mind the option to log events at different levels of verbosity. Does it make sense to you?

It does indeed, and combining that with an addressee per job/line would be ideal. (Some users only need to know if there were any errors and others want the details.)

We would prefer to have one process doing “all” the archiving so it will be serialized and we will not have to worry about multiple scripts running into each other. In that case we need to report to different people for different jobs. Some of the jobs will be to support systems administrators (network and OS) and others will be to support ERP admins.

I just found that I was mixing up two solutions and attached the file purge-log.txt, which does some of the things I suggested but does not do the backup. Ideally I would like to see this way of configuring a to-do list in a .csv or .XML file and have the option to select backup --type PURGE (first back up into a zip and then delete the originals).

I think the only aspect which is not covered at all by the script is the option to delete source files. For this topic I must admit I have not completely understood your request: the first thing that came to my mind was you might expect to have source files deleted *after* a successful backup operation, but when reading your post I see you write "Filemask for deleting files before archiving". Could you please clarify this ?

Actually I was thinking of two types of ‘deletions’.

1 Clean up files (like *.TMP or files of 0 bytes) before archiving, because we don't need that stuff in the archives. Cleaning these files up beforehand prevents a lot of clutter (see the small sketch below point 2).

2 Delete the files that have been archived. So instead of making a backup I would like the option to move files into an archive, like the -m option in PKZIP or ARJ a long time ago. From what I have seen that is not the intention of your script.
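
For point 1, a minimal PowerShell sketch of the kind of clean-up I mean (the root path and the patterns are just examples, not part of your script):

# Remove *.tmp files and zero-byte files below a source root before they get archived.
Get-ChildItem -Path D:\Data -Recurse -Force |
    Where-Object { -not $_.PSIsContainer -and ($_.Extension -eq '.tmp' -or $_.Length -eq 0) } |
    Remove-Item -Force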

By the way, I found an option to use zip from Windows directly, in this script for SharePoint backups: http://spbackup.codeplex.com/. In some cases 7-Zip is not needed at all, but I guess it is far more capable than the built-in routines.

The SPBackup script has some interesting features: XML configuration and small, effective functions for e-mail and logging. (As of today we use this script for SharePoint 2007 backups.)

Regarding junction.exe (V1.05)

I just tested the claim that deleting a junction created with "junction.exe c:\temp\mylink c:\temp.tst\mylinkeddir" would also delete the file(s) and/or subfolders it is pointing to, and found that nothing is lost except the link. (Tested on Win XP SP3 en-US.)

After deleting the link in Total Commander as well as in Explorer nothing was actually deleted except the link itself. Should be reassuring :-)

Thank you.

A lot of other stuff to think about, but when this works it will help a lot of administrators do their jobs better. I hope you understand my explanations and are willing to consider my proposals.

Thanks a lot and hope to mail you soon.

Kind regards,

Gert-Wim Scheppink


Coordinator
Feb 3, 2010 at 12:39 PM

Hi Gert,

Of course your suggestion to use soft-links is interesting. I just tested 7-zip scanning files through a soft-linked remote UNC share and it works, so I will add support for that. The only drawback is performance: reading file data over a network share is, obviously, slower. Nevertheless, using soft-links it is possible to concentrate backup processes on one single machine with the appropriate UNC paths available to connect to. Just added it to the to-do list.

Switching to "jobs" ... at present the script works as an "atomic" procedure: it receives as input parameters (at least) the file name to generate, the type, and the file-selection-criteria then runs and stops. Additional arguments can be passed via CLI also but it's likely to have them hardcoded in a separate file (hardcoded vars). If I understand well you have in mind some sort of modification that can instruct the script to execute a "batch" of jobs instead of a single one and stop. In other words: launch the script and have it to generate this archive with this selection criteria and this notify addresses, then (if any) create a second archive with a different name, a different selection criteria and notify other people and ... so on. Am I right ? If this is the case ... well there might be some work to do but ... I already had in mind something similar. Still working on some ideas. However I would not like to mess up with XML files: even they're very flexible they're a bit tricky to manage for unexperienced users (it's enough not to close an open tag). Plain text files with lots of comments before a directive are my choice (love how *NIX guys do their job).

Deletions. Let's split the problem. Deletion of "useless" files like *.tmp or temporary Word lock files (~WRL....) can easily be implemented. I think I will add, within the selection file, one more directive that instructs the script to delete certain file patterns during the selection process. As for the second aspect, I think I can add a new "backup method" named "MOVE" (at present we have "FULL", "INCR", "DIFF" and "COPY"). With this new method, the procedure of "clearing the Archive bit" can simply be replaced by deletion of the successfully archived files. There is only one question to answer: the "FULL" and "COPY" methods select all files regardless of their Archive attribute, while "INCR" and "DIFF" do consider the Archive attribute. I think that a MOVE operation should behave like the "COPY" method, selecting all files matching the criteria regardless of their Archive attribute.
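
Just to be explicit about what I have in mind, here is a rough conceptual sketch (not the script's real code; $sevenZip, $archiveFile and $listFile are placeholder variables) of how the post-archive step of a MOVE method could replace the clearing of the Archive bit:

# Archive everything in the selection list, then delete the sources only if 7-Zip reports success.
& $sevenZip a -t7z $archiveFile ('@' + $listFile)
if ($LASTEXITCODE -eq 0) {
    # exit code 0 means no errors; any warning or error leaves the source files untouched
    Get-Content $listFile | ForEach-Object { Remove-Item -LiteralPath $_ -Force }
}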

I saw SPBackup uses "native" Windows zip: it simply creates a new file with a PKZIP hex header and then lets the Windows API do the rest with CopyHere operations. It has a limit though: the Windows API can't handle large compressed archives (more than 4 GB) as it does not support ZIP64. As our backups are, on average, over 30 GB ... well, native zip support is not an option.
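
For reference, this is roughly the trick SPBackup relies on (the paths here are only examples): write the 22-byte header of an empty PKZIP archive, then hand a folder to the shell, which is exactly where the missing ZIP64 support bites:

# Create an "empty" zip file: the PK\x05\x06 end-of-central-directory signature plus 18 zero bytes.
$zip = 'C:\Temp\sample.zip'
[byte[]] $header = 0x50, 0x4B, 0x05, 0x06
$header += New-Object byte[] 18
Set-Content -Path $zip -Value $header -Encoding Byte

# Let the shell compress a folder into it; CopyHere runs asynchronously and knows no ZIP64,
# so anything beyond roughly 4 GB is out of reach.
$shell = New-Object -ComObject Shell.Application
$shell.NameSpace($zip).CopyHere('C:\Temp\Dir1')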

A couple of words on your finding that deleting a junction point is not traversed. Well, I had to put that disclaimer in because on my Windows Vista and on one Windows 2003 machine, while deleting a junction point, I sadly wiped out huge amounts of important data (eventually I got it back from a backup). So I prefer to keep myself and the users warned about the brutal deletion of a junction point.

 

Feb 4, 2010 at 1:32 AM

Anlan,

See my comments below.

From: Anlan [mailto:notifications@codeplex.com]
Sent: Wednesday, 3 February 2010 13:40
To: codeplex@scheppink.com
Subject: Re: RE: Failed to post to CodePlex.com project 7zbackup [7zbackup:82945]

From: Anlan

Hi Gert,

Of course your suggestion to use soft-links is interesting. I just tested 7-zip scanning files through a soft-linked remote UNC share and it works, so I will add support for that. The only drawback is performance: reading file data over a network share is, obviously, slower. Nevertheless, using soft-links it is possible to concentrate backup processes on one single machine with the appropriate UNC paths available to connect to. Just added it to the to-do list.

Thanks for that. We realise that doing these kinds of things over the LAN will be a bit slower, but in practice it is not that bad on a Gb LAN. We move the selected files in one step to the archiving server, where the rest of the job takes place on local disk(s).

Switching to "jobs" ... at present the script works as an "atomic" procedure: it receives as input parameters (at least) the file name to generate, the type, and the file-selection-criteria then runs and stops. Additional arguments can be passed via CLI also but it's likely to have them hardcoded in a separate file (hardcoded vars). If I understand well you have in mind some sort of modification that can instruct the script to execute a "batch" of jobs instead of a single one and stop. In other words: launch the script and have it to generate this archive with this selection criteria and this notify addresses, then (if any) create a second archive with a different name, a different selection criteria and notify other people and ... so on. Am I right ? If this is the case ... well there might be some work to do but ... I already had in mind something similar. Still working on some ideas. However I would not like to mess up with XML files: even they're very flexible they're a bit tricky to manage for unexperienced users (it's enough not to close an open tag). Plain text files with lots of comments before a directive are my choice (love how *NIX guys do their job).

You understand correctly, and I would even prefer a job list as you describe it, in a plain readable format. (XML is not the most readable option; a tab- or semicolon-separated CSV would do nicely and stay readable.)

In my opinion it would be better to have a wrapper around your script that takes care of the jobs one by one. If one job goes wrong, it should not stop the process for the remaining jobs.
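
Something along these lines is what I have in mind; a rough sketch only, with an invented job-list file, invented column names, and parameter names that are only indicative of how a single job could be handed to your script:

# Run every job from a semicolon-separated job list; one failing job must not stop the rest.
Import-Csv -Path C:\Backup\joblist.csv -Delimiter ';' | ForEach-Object {
    $job = $_
    try {
        # the parameters below are placeholders, not the script's actual command line
        & C:\Backup\7zbackup.ps1 --type $job.Type --selection $job.Selection --notify $job.MailTo
    }
    catch {
        Write-Warning ("Job '" + $job.JobName + "' failed: " + $_)
    }
}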

Deletions. Let's split the problem. Deletion of "useless" files like *.tmp or temporary Word lock files (~WRL....) can easily be implemented. I think I will add, within the selection file, one more directive that instructs the script to delete certain file patterns during the selection process. As for the second aspect, I think I can add a new "backup method" named "MOVE" (at present we have "FULL", "INCR", "DIFF" and "COPY"). With this new method, the procedure of "clearing the Archive bit" can simply be replaced by deletion of the successfully archived files. There is only one question to answer: the "FULL" and "COPY" methods select all files regardless of their Archive attribute, while "INCR" and "DIFF" do consider the Archive attribute. I think that a MOVE operation should behave like the "COPY" method, selecting all files matching the criteria regardless of their Archive attribute.

This would fit our needs and I cannot think of another way to do this. Archiving (with MOVE) should move all files matching the criteria (based on a regular-expression 'mask' and filtered by age). Creation date would do for us, but maybe last-accessed date would be nice, to archive files not used for a certain amount of time.

I saw SPBackup uses "native" Windows zip: it simply creates a new file with a PKZIP hex header and then lets the Windows API do the rest with CopyHere operations. It has a limit though: the Windows API can't handle large compressed archives (more than 4 GB) as it does not support ZIP64. As our backups are, on average, over 30 GB ... well, native zip support is not an option.

I did not know about these limitations, and I guess 7-Zip is much better for these purposes.

A couple of words on your finding that deleting a junction point is not traversed. Well, I had to put that disclaimer in because on my Windows Vista and on one Windows 2003 machine, while deleting a junction point, I sadly wiped out huge amounts of important data (eventually I got it back from a backup). So I prefer to keep myself and the users warned about the brutal deletion of a junction point.

I have seen warnings of this kind on many web pages and I know it is a dangerous tool, both on Windows and *nix. Hard links behave like you describe, and it is difficult for a user to distinguish between them. I totally agree with the warning you give, but it should not scare people away.


Now it comes to mind that I do not fully understand why you use the junction tool. On a local system you could use "subst", although I don't see why you would (probably to prevent the usage of UNC paths). On remote drives you could use "net use" to temporarily map a share or a subfolder of that share. We use this in our environment for the current archiving. This should do the trick too, without ever using junction.exe.

You could test for this behavior in the script and warn users about this.

The test should be something like:

Test if junction.exe is found (done already).

Create a folder x:\junctest.fld and a second folder x:\junctest.tgt\MyJunc.

Now use junction.exe x:\junctest.fld\juncsrc x:\junctest.tgt\MyJunc.

Use an OS routine to delete x:\junctest.fld\juncsrc. If x:\junctest.tgt\MyJunc disappeared, call Houston (because we have a potential problem).

Junction /D should not have that issue and should clean up nicely. Mklink does a better job but only runs on Vista and more recent OSes.
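
In PowerShell the self-test could look roughly like this (the paths under %TEMP% and the canary file are made up, and the Remove-Item line stands in for whatever deletion routine the script really uses):

# Build a throw-away junction and check whether deleting it also wipes the real target.
$src = Join-Path $env:TEMP 'junctest.fld'
$tgt = Join-Path $env:TEMP 'junctest.tgt\MyJunc'
New-Item -ItemType Directory -Path $src, $tgt -Force | Out-Null
'canary' | Set-Content -Path (Join-Path $tgt 'canary.txt')

& junction.exe (Join-Path $src 'juncsrc') $tgt | Out-Null      # junction.exe must be on the PATH
Remove-Item -Path (Join-Path $src 'juncsrc') -Recurse -Force   # the deletion routine under test

if (-not (Test-Path (Join-Path $tgt 'canary.txt'))) {
    Write-Warning 'Houston: deleting the junction also removed the target contents!'
}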

Kind regards and I hope to hear from you soon.

Gert-Wim Scheppink

If I can be of any help, just let me know. I am new to PowerShell scripting, but besides that I have about 15 years of experience in systems administration of all kinds.


Coordinator
Feb 19, 2010 at 7:04 PM
gwscheppink wrote:

Now it comes to mind that I do not fully understand why you use the junction tool. On a local system you could use "subst", although I don't see why you would (probably to prevent the usage of UNC paths). On remote drives you could use "net use" to temporarily map a share or a subfolder of that share. We use this in our environment for the current archiving. This should do the trick too, without ever using junction.exe.

You could test for this behavior in the script and warn users about this.

Hi Gert, hope everything is ok on your side.

I hope you'll appreciate some further implementations I have made to the script following your suggestions. (Sorry the project develops so slowly, but ... I work on it in my free time.)

To answer your question about the use of junctions instead of NET USE or SUBST ... this is due to how 7-zip behaves when scanning directories in search of files to archive. Imagine the following scenario: you have two directories on your source drive named \dir1 and \dir2. Each of them contains (just as an example) a file named sample.txt, so your tree command will report something like this:

C:.
├───Dir1
│       Sample.Txt

└───Dir2
        Sample.Txt

You want to archive all these contents in a single compressed file, in a single pass ... well, if you pass 7-zip a list of directories like C:\Dir1\* and C:\Dir2\* you will get back a "Duplicate file name" error (as sample.txt is encountered twice).
The "problem" is well known to 7-zip's developer Igor Pavlov: he says it is behavior by design, which can be bypassed by making 7-zip search "relative" to the files.
In other words, you should "group" your multiple sources under a "master root" and make the scan relative. Like this:

C:.
└───MasterDir
    ├───Dir1
    │       Sample.Txt
    │
    └───Dir2
            Sample.Txt

In this way you enter MasterDir, make it the working directory, and pass 7-zip a list like this:

Dir1\Sample.txt
Dir2\Sample.txt

This works around the duplicate file name problem. The only option I had to temporarily "rewrite" the file system during backup operations ... was to drop in junction points. That way you can create a customized "view" of your file system and let 7-zip scan from a single root. SUBST can't do this, as it only lets you map one drive letter to a single path, and the same goes for NET USE.
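
To make the relative-scan idea concrete, here is a stripped-down sketch (the paths, the list-file name and the 7z.exe location are just examples, not what the script literally does):

# Work from the "master root" and hand 7-Zip a list of paths relative to it.
Set-Location C:\MasterDir
$list = Join-Path $env:TEMP 'filelist.txt'
Get-ChildItem -Recurse |
    Where-Object { -not $_.PSIsContainer } |
    ForEach-Object { $_.FullName.Substring($PWD.Path.Length + 1) } |
    Set-Content -Path $list

# Because every entry is relative (Dir1\Sample.Txt, Dir2\Sample.Txt), no duplicate file name error occurs.
& 'C:\Program Files\7-Zip\7z.exe' a -t7z D:\Backups\sample.7z ('@' + $list)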

Hope my explanation makes sense to you.