Amazon Web Services (AWS) offers some very affordable archive storage via it’s S3 Glacier service. I’ve used this on a backup account in the past to store archives, and have decided it’s time to clear down this account (oh, and save $0.32 a month in doing so).
The main challenge with doing this, is that unlike S3, S3 Glacier (objects stored directly there rather than using the Glacier storage tier within S3) objects can only be deleted via the AWS CLI. And to delete a Glacier Vault, you’ve got to delete all of the objects.
In this post I’ll spin up a Lightsail box and wipe out the pesky Glacier objects through the AWS CLI. This doesn’t require any changes on your local PC, but will require some patience.
There’s often a discussion about what each of these autonumber/hash functions does in Qlik. We commonly see these functions used for creating integer key fields, anonymising data (for example, names in demo apps), and maintaining long string fields for later comparison (as the hashes are shorter than the strings they replace).
To do this, I’m using the script below. I’m also keen to show outputs from QlikView vs Qlik Sense, and results of running the same script on another machine.
My observations are the following:
– AutoNumber/AutoNumberHash128/256 – different output per load as the value is typically based on the load order of the source data
– Hash128/160/256 – the same output, across every load. Stays the same between Qlik Sense and QlikView, and also between different machines
I recently spotted an unexpected slow-down in a load script, which was caused by using one of these functions. In summary:
– Using RowNo() in a simple load script is considerably slower than RecNo()
– If you must use RecNo(), it may be faster to do this in a direct load
– If you must use RowNo(), it may be faster to do this in a resident load
If you’ve got a couple of VMs, then trimming them occasionally will help with storage management on the host. Disk files (like VDI and VMDK images) will grow over time if the VM was configured to expand the disk as needed – but they will never shrink on their own.
This was tested on a Windows 10 VM managed by VirtualBox, the starting file size was over 40GB, as it had been used for various roles and grown over time.
In this post I explore the outputs of RecNo() and RowNo() to demonstrate the difference visually.
These two fields are often used interchangeably, but they provide different output. In summary:
– RecNo() is a record number based on source table(s)
– RowNo() is a record number based on the resulting table
As a result, RowNo will always provide a unique integer per row in an output table, while RecNo is only guaranteed to be unique when a single source is loaded (RecNo is based on each single source table, interpreted individually rather than collectively).
Structured data is very familiar to most people, as it’s what is captured by most user-facing business systems. You’ve got columns and rows, and data values are stored against a field name and identifying value of some type, with a clear data type.