Archiware P5 scripts

Background: I manage a few backup and archive servers for clients and these run the Archiware P5 suite of software (archive, backup and sync). To help manage these servers over the years I’ve written some P5 monitoring tools for Watchman and MunkiReport as well worked on helper scripts using the cli tools or more recently the API.

In an effort to share some simple examples of what is possible I have organized a few samples from my GitHub repos on the code.matx.ca page with some useful descriptions and text about usage and purpose of the scripts. They are hosted on in repo here:

https://github.com/macvfx/Archiware

I have more scripts of general and specific interest in my repo to P5 users or anyone managing files. See this post on find scripts.

The P5 code toolbox

The following P5 scripts are just a few examples and I have more to share, if there’s interest. I created most of these simple tools initially to run on the P5 server directly but I have since created, for my clients, versions which run from anywhere. Also, in some cases, a few scripts have now been built into easy to use Mac applications where it makes sense. If you want some help and you want to hire me to help with these things please reach out.

The scripts are in three categories: 

1) P5 archive intelligence (all archive jobs from Db exported as a spreadsheet, or get recent archive jobs via REST API),

2) P5 housekeeping (make all full tapes read only, show all appendable archive tapes), and 

3) P5 info backup (export all volumes into one csv, and export all volumes inventory as TSV with barcode as the name)

P5 Archive Intelligence

What do I mean by “archive intelligence”? Simply, I want to know about everything I’ve archived. One should consider the P5 Archive server as the ultimate source for all things archived but in some cases my clients don’t use the P5 server directly, or they want the information organized differently, like in a spreadsheet. And while the Archiware P5 suite of software is ever evolving, growing and adding features (even some lovely visual dashboards in v.7.4) I have been attempting to solve the perennial question of “what do we have archived?” in better and more useful ways.

P5 Information Backup

Related to archive intelligence is knowing what is in my P5 archive system entirely. I modified a provided shell script from the Archiware cli manual to output a csv of a list of all P5 volumes in the tape library (aka jukebox) so that I might know what is in the system at all times, and even if my P5 server is not running I have a record of every tape. This is one of the scripts I run with my periodic and backup workflows but more on my own special P5 backups (backups of the P5 Db and other metadata) in another post. The Archiware provided P5 volume list script inspired my own script to list full and appendable archive tapes which I have as a one-click desktop app for my clients. When they want to restore something P5 will tell them what to put in the tape library but if you have a lot of tapes maybe you want to know what to remove and so I have a list of candidates (ie take out the full tapes, and leave the appendable archive tapes). Helpful, yes.

P5 Inventory

There is a P5 cli command to export out the complete inventory and depending on how long you’ve been archiving and how much is in your archive this tool can take a long time to export a list of every file ever. And because of my mostly non-advanced super skills some times I’d find this process would time out. (There are ways around this but that’s another post). Basically, Too much archive! When it didn’t error out I had a big file… so one day a friend of mine suggested we use Jupyter notebook and yes, use Python!, to do some data analysis. A really fun project, great tool, but this is a hard problem to solve. We made a thing, it worked for a while then I wanted to find a better way. People liked my bar graphs and total amount archived but they also wanted spreadsheets. So let’s give them what they want.

Two (or three) approaches:

  • cli
  • api
  • db

I like lower case letters which are acronyms but let’s explore further.

cli

Using the cli (command line interface) usually in a shell script (but also in clever Mac apps) typically requires running the script with the Archiware P5 cli (nsdchat) locally on the P5 server and certainly this is what I did when I was testing scripts and various tools. It makes sense if you’re administering a server and you remote in (ssh or screenshare) and that’s where you start. After I wile I discovered the trick to make these cli based scripts run from anywhere which was handy if I wanted to connect to all my p5 servers at once in a script or have my client use an app on their desktop which talks to the P5 server. More on awsock in another post.

An example of a cli script using nsdchat I have a script for taking the inventory contents of each archive tape (LTO) and writing its contents to a TSV file (tab separated values) which is like a CSV (comma separated values).

nsdchat Volume names

Give me a list of p5 volumes (ie tapes)and tell me which one are archive tapes and that are readonly and what is the barcode.

api

Ok instead of a cli command dependent on the nsdchat binary installed somewhere we are doing http (web) magic with the API (application programming interface) — a set of known commands in a path based on GET, PUT, POST, DELETE. The API has a different way of doing things than the cli but you can ask a lot of the same basic questions.

In my api-archive-overview script I am sending one command then using jq to select elements to organize the info into a csv (spreadsheet). This example is set to run locally but this is easily modifiable to run from anywhere. For one client I have a script that talks to every P5 server, each in a different city and asks them all what they’ve been archiving then organizes it all into one spreadsheet. It’s fun, and it’s helpful.

db

The best for last. I mentioned above my attempts to use the inventory command which itself goes to all the relavant databases and gathers all the requested data about every archive file, its size and when it was archived etc. Yeah, that’s one way to do it. I’ve shown two examples above for the cli and api but a third is to just talk to the database directly. This is an advanced technique and should only be attempted by an expert. Ok, I’m kidding. As long as you’re not writing to the database and only reading from it, this is pretty safe. What you do with the info is another magic trick and one which I’ve been working on. Two db examples are dump all jobs in a csv file and the second, dump only archive jobs. I’ve got a more advanced script which takes the data and uses sql commands to organize into a csv of how much data archived per day per week per month per year and totals, which is nice for some people who like spreadsheets and want to know about everything ever done. Caution: once you look into the Db you’ll see a lot of things, and sorting through it takes time. I found when making more advanced and selective scripts that the cli jobs used by the very old P5 Archive app (by Andre Aulich) for example showed up as system jobs not archive jobs, so you have to be careful if you want to include those. Have fun.

P5 Housekeeping

Finally, some housekeeping scripts are included in the example repo, like the script to make all archive tapes that are “full” to be marked “read only” which is handy is you also have a script to only export the contents of each tapes from archive tapes marked readonly so many little scripts to do little things.

P5 Archive Prep scripts

There’s another category of scripts which I haven’t elaborated on but I do have a few examples in my repo. I do have scripts to prepare or examine files and folders going to be archived to LTO with P5 Archive. These scripts do various things like check for trailing space in the name or check file name length but maybe the most important ones I have are scripts which take the path of the archived projects and create html maps, file size directory listings and spreadsheets (again!) of the exif data of all files to keep for future. Clients do refer to the archive stub files (p5c) but they also find it handy to see the directory map and the file size of archived items without going into the p5 server. I’m not trying to replicate the P5 server, or replace it, but this falls into p5 housekeeping and p5 information backup.

That’s enough for now. If you’ve been reading and following along then let me know if you any questions or want any help with a P5 or other related projects. If you have better ways to do these things feel free to share. My scripts are always evolving and I love to learn.

Reference:

For more info on Archiware P5 scripting and building code to interact with it I’d recommend checking out the main P5 manual, as well as the CLI (command line interface) manual, and the API documentation, knowledge base (support). As well as the sample scripts and the Archiware blog, and the video series generally.